Attorney Docket No. 039386-2277 



THE UNITED STATES PATENT AND TRADEMARK OFFICE 
BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES 



Applicant:                   Elliott et al.
Title:                       KINASES AND PHOSPHATASES
Appl. No.:                   10/554,917
International Filing Date:   3/24/2004
371(c) Date:                 04/27/07
Examiner:                    Swope, Sheridan
Art Unit:                    1652
Confirmation Number:         9780


BRIEF ON APPEAL 

Mail Stop Appeal Brief - Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Under the provisions of 37 C.F.R. §41.37, this Appeal Brief is being filed together with a 
credit card payment form in the amount of $540.00 covering the 37 C.F.R. § 41.20(b)(2) appeal 
fee. If this fee is deemed to be insufficient, authorization is hereby given to charge any 
deficiency (or credit any balance) to the undersigned deposit account 19-0741. 



WASH 6873570 



Attorney Docket No. 039386-2277 
Patent Application No. 10/554,917 



TABLE OF CONTENTS 

I. REAL PARTY IN INTEREST 1 

II. RELATED APPEALS AND INTERFERENCES 2 

III. STATUS OF CLAIMS 3 

IV. STATUS OF AMENDMENTS 4 

V. SUMMARY OF CLAIMED SUBJECT MATTER 5 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 6 

VII. ARGUMENT 7 

A. Appellants' specification teaches that SEQ ID NO: 13 is a novel variant 
of the β-adrenergic receptor kinase 2 and also illustrates a specific, 
substantial and credible utility 8 

1. Appellants' specification teaches that SEQ ID NO: 13 is 94% 
identical at the amino acid level to Homo sapiens beta-adrenergic 
receptor kinase 2 8 

2. Appellants' specification teaches that the probability of randomly 
obtaining a polypeptide corresponding to SEQ ID NO: 13 is nil ....... 10 






3. Appellants' specification teaches that almost all of the signature 
sequences, domains or motifs are 100% identical between SEQ ID 
NO: 13 and the β-adrenergic receptor kinase 2 sequence 12 

B. The skilled artisan would reasonably believe that the assay methods used 
to establish a specific, substantial and credible utility for SEQ ID NO: 56 
are valid 13 

C. The references cited in the January 13, 2010 reply are relevant 16 

D. The skilled artisan would find the stated utility significant, substantial and 
credible without additional statistical analysis 18 

E. Summary regarding the utility rejection under 35 U.S.C. § 101 19 

F. The claimed invention meets the enablement requirement under 35 U.S.C. 
§ 112, first paragraph 20 

VIII. CONCLUSION 21 

IX. CLAIMS APPENDIX 22 

X. EVIDENCE APPENDIX 34 

XI. RELATED PROCEEDINGS APPENDIX 36 






TABLE OF AUTHORITIES 

Guidelines: 

M.P.E.P. § 2107.01.II 21 






I. REAL PARTY IN INTEREST 

The real party in interest is INCYTE CORPORATION. 

II. RELATED APPEALS AND INTERFERENCES 

The Appellants are unaware of any related appeals or interferences. 

III. STATUS OF CLAIMS 

Claims 1-4, 6-7, 11, 14-20, 23, 26-32, 34, 36, 44-55, and 142-144 are currently pending. 
Claims 1, 2, 11, 14-20, 23, 26-32, 34, 36 and 44-55 currently are withdrawn. 
Claims 3, 4, 6, 7 and 142-144 are pending and under examination. 
The rejection of claims 3, 4, 6, 7 and 142-144 is appealed. 

IV. STATUS OF AMENDMENTS 

No amendment has been filed subsequent to the issuance of the final Office Action dated 
October 16, 2009. 

V. SUMMARY OF CLAIMED SUBJECT MATTER 

Independent claim 3 is to be argued in the brief. The relevant citation to the specification 
is shown in the parentheses below. 

Independent claim 3 reads as follows: 

3. An isolated polynucleotide encoding a polypeptide (page 23, line 30), wherein 
the polypeptide consists of the amino acid sequence of SEQ ID NO: 13 (page 23, lines 31-32). 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

(1) Whether claims 3, 4, 6, 7, 142-144 are unpatentable under 35 U.S.C. § 101 as 
allegedly lacking a specific, substantial and credible utility. 

(2) Whether claims 3, 4, 6, 7, 142-144 are unpatentable under 35 U.S.C. § 112, first 
paragraph, due to the alleged lack of utility. 

VII. ARGUMENT 

The specification provides numerous asserted utilities for the claimed polynucleotides 
and encoded polypeptides. One of these utilities, "as a tissue marker for brain" is explicitly 
asserted at page 102, line 18. 

With respect to this utility, the final Office Action dated October 16, 2009 asserted that 
"the skilled artisan would not conclude that SEQ ID NO: 56 is a marker for brain tissue based on 
the specification disclosing that SEQ ID NO: 56 has a mere two-fold higher expression in brain 
than in the reference sample." Office Action dated October 16, 2009 at page 3, emphasis added. 
No further support for the rejection is provided in this Office Action. 

In the Advisory Action dated January 26, 2010, the rejection was maintained on the same 
grounds, namely that a "[t]wo-fold higher expression of SEQ ID NO: . . . [56] in brain than in the 
reference sample does not provide evidence that SEQ ID NO: . . . [56] is a specific marker for 
brain." Advisory Action dated January 26, 2010 at page 2, emphasis added. In support, the 
Advisory Action puts forth three assertions. First, the Advisory Action asserts that "[t]he 
reference sample, comprising heart, kidney, lung, placenta, small intestine, spleen, stomach, testis 
and uterus, comprises some tissue having a level of SEQ ID NO: . . . [56] that is higher . . . [than] 
the reference samples which is an average of all included tissues," and that "[m]ore likely than 
not, compared to brain, one or more tissues within the reference sample have the same or higher 
levels of SEQ ID NO: . . . [56]." Id. 

Second, the Advisory Action asserts that "[n]one of Yue et al., Lee et al or Vasseur et al 
discusses using polynucleotides that are tissue-specific markers," and that "[e]ach of said 
references discusses using a polynucleotide as a probe to detect differences in expression of the 
complementary polynucleotide in, for example, different tissues (Yue) or due to parameters such 
as ageing and caloric restriction (Lee), or transformation with ras (Vasseur)." Id. 

Third, the Advisory Action asserts that "[t]he skilled artisan would have been aware of 
statistical methods that can be used to analyze variability and determine whether a difference is 
significant; for example the Student's t-test," and that "for a substance to be considered a tissue- 
specific marker, the expression of the substance in the tissue must be essentially exclusive, i.e., 
not expressed in other tissues." Id. 

Appellants respectfully traverse these grounds for rejection. 

A. Appellants' specification teaches that SEQ ID NO: 13 is a novel variant of 
the β-adrenergic receptor kinase 2 and also illustrates a specific, substantial 
and credible utility 

1. Appellants' specification teaches that SEQ ID NO: 13 is 94% identical 
at the amino acid level to Homo sapiens beta-adrenergic receptor 
kinase 2 

Table 1 of the specification at page 113 shows that the polypeptide of SEQ ID NO: 13 is 
encoded by the polynucleotide of SEQ ID NO: 56. Specifically, Table 1 is described at page 43 
of the specification as follows: "Table 1 summarizes the nomenclature for the full length 
polynucleotide and polypeptide embodiments of the invention. Each polynucleotide and its 
corresponding polypeptide are correlated to single Incyte project identification number (Incyte 
Project ID)." Specification at page 43, lines 25-27. Thus, the amino acid sequence of SEQ ID NO: 13 is 
encoded by the polynucleotide sequence of SEQ ID NO: 56. 

Table 2 of the specification at page 121 provides homology data regarding the 
polypeptide of SEQ ID NO: 13. SEQ ID NO: 13 is 94% identical at the amino acid level to 
Homo sapiens beta-adrenergic receptor kinase 2, identified in Table 2 as g312395, as shown 
in the sequence alignment below. Differences between the amino acid sequences are shown in bold 
and underlined. 

Alignment of SEQ ID NO: 13 and g312395 

Score = 1338 bits (3462), Expect = 0.0, Method: Compositional matrix adjust. 
Identities = 647/688 (94%), Positives = 649/688 (94%), Gaps = 38/688 (5%) 

Query  1    MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEIT?DKIFN  60
            MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEIT DKIFN
Sbjct  1    MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEITFDKIFN  60

Query  61   QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC  120
            QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC
Sbjct  61   QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC  120

Query  121  SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV  180
            SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV
Sbjct  121  SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV  180

Query  181  ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER  240
            ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER
Sbjct  181  ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER  240

Query  241  IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE  300
            IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE
Sbjct  241  IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE  300

Query  301  IILGLEHMHNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE  360
            IILGLEH+HNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE
Sbjct  301  IILGLEHVHNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE  360

Query  361  VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE  420
            VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE
Sbjct  361  VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE  420

Query  421  LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA  480
            LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA
Sbjct  421  LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA  480

Query  481  DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK  540
            DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK
Sbjct  481  DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK  540

Query  541  RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESR---  597
            RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESR
Sbjct  541  RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNL  600

Query  598  -----------------------------------SDPEFVQWKKELNETFKEARRLLRR  622
                                               SDPEFVQWKKELNETFKEA+RLLRR
Sbjct  601  LTMEQILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNETFKEAQRLLRR  660

Query  623  APKFLNKPRSGTVELPKPSLCHRNSNGL  650
            APKFLNKPRSGTVELPKPSLCHRNSNGL
Sbjct  661  APKFLNKPRSGTVELPKPSLCHRNSNGL  688

[The query residue at position 55 is illegible in this copy and is shown as "?".] 





2. Appellants' specification teaches that the probability of randomly 
obtaining a polypeptide corresponding to SEQ ID NO: 13 is nil 

Table 2 of the Specification at page 121 shows that the BLAST probability score is 0.0, 
"which indicates the probability of obtaining the observed polypeptide sequence alignment by 
chance." Specification at page 44, lines 19-21. Table 2 also references Parruti et al., "Molecular 
cloning, functional expression and mRNA analysis of human beta adrenergic receptor kinase 2," 
Biochem. Biophys. Res. Commun., 190:475-481 (1993) ("Parruti," EXHIBIT A), which is 
incorporated into the specification by reference (see Table 2 at page 121; Specification at page 
44, lines 9-10). Parruti describes the sequence of human beta-adrenergic receptor kinase 2 
polypeptide, the cDNA sequence of which was submitted to the GenBank/EMBL Data Bank 
with accession number X69117. An alignment of the polypeptide associated with accession 
number X69117 and g312395 illustrates that these two sequences are 100% identical at the 
amino acid level. 






Alignment of X69117 (QUERY) and g312395 (SBJCT) 

Score = 1437 bits (3721), Expect = 0.0, Method: Compositional matrix adjust. 
Identities = 688/688 (100%), Positives = 688/688 (100%), Gaps = 0/688 (0%) 

Query  1    MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEITFDKIFN  60
            MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEITFDKIFN
Sbjct  1    MADLEAVLADVSYLMAMEKSKATPAARASKRIVLPEPSIRSVMQKYLAERNEITFDKIFN  60

Query  61   QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC  120
            QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC
Sbjct  61   QKIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSC  120

Query  121  SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV  180
            SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV
Sbjct  121  SHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFMESDKFTRFCQWKNV  180

Query  181  ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER  240
            ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER
Sbjct  181  ELNIHLTMNEFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNER  240

Query  241  IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE  300
            IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE
Sbjct  241  IMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATE  300

Query  301  IILGLEHVHNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE  360
            IILGLEHVHNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE
Sbjct  301  IILGLEHVHNRFVVYRDLKPANILLDEHGHARISDLGLACDFSKKKPHASVGTHGYMAPE  360

Query  361  VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE  420
            VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE
Sbjct  361  VLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPE  420

Query  421  LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA  480
            LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA
Sbjct  421  LKSLLEGLLQRDVSKRLGCHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPPRGEVNAA  480

Query  481  DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK  540
            DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK
Sbjct  481  DAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERWQQEVTETVYEAVNADTDKIEARK  540

Query  541  RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNL  600
            RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNL
Sbjct  541  RAKNKQLGHEEDYALGKDCIMHGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNL  600

Query  601  LTMEQILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNETFKEAQRLLRR  660
            LTMEQILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNETFKEAQRLLRR
Sbjct  601  LTMEQILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNETFKEAQRLLRR  660

Query  661  APKFLNKPRSGTVELPKPSLCHRNSNGL  688
            APKFLNKPRSGTVELPKPSLCHRNSNGL
Sbjct  661  APKFLNKPRSGTVELPKPSLCHRNSNGL  688

Accordingly, SEQ ID NO: 13 is 94% identical at the amino acid level to the human β- 
adrenergic receptor kinase 2. 
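
By way of illustration only, the relationship between the identities, gaps and mismatches reported in an alignment header can be checked with simple arithmetic. The Python sketch below is not part of the specification or the record. The function name `alignment_stats` and the ten-residue toy alignment are hypothetical and are used only to show how a percent identity figure follows from an aligned pair of sequences. The header figures 647, 688 and 38 are taken from the alignment of SEQ ID NO: 13 and g312395.

```python
def alignment_stats(query: str, subject: str) -> dict:
    """Count identities, gaps and mismatches for two aligned sequences.

    Both strings must have the same length; '-' marks a gap position.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    length = len(query)
    gaps = sum(1 for q, s in zip(query, subject) if q == "-" or s == "-")
    identities = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    mismatches = length - identities - gaps
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "mismatches": mismatches,
        "percent_identity": 100.0 * identities / length,
    }


# Hypothetical toy alignment: one gap and one substitution in ten columns.
toy = alignment_stats("MADLE-VLAD", "MADLEAVLSD")

# Figures from the alignment header for SEQ ID NO: 13 versus g312395.
header_percent = 100.0 * 647 / 688      # about 94.04
header_mismatches = 688 - 647 - 38      # aligned columns that are neither identical nor gaps
```

On these figures, the three non-gap differences correspond to the substitutions at positions 55, 308 and 617 of SEQ ID NO: 13 that appear in the alignment.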

3. Appellants' specification teaches that almost all of the signature 
sequences, domains or motifs are 100% identical between SEQ ID 
NO: 13 and the β-adrenergic receptor kinase 2 sequence 

Table 3 at page 146 of the Specification describes signature sequences, domains and 
motifs present in SEQ ID NO: 13. Numerous signature sequences, domains and motifs are 
presented for SEQ ID NO: 13, all of which are present in the β-adrenergic receptor kinase 2 
sequence. Moreover, almost all of the signature sequences, domains or motifs are 100% identical 
between SEQ ID NO: 13 and the β-adrenergic receptor kinase 2 sequence. Exceptions include 
domains which span the mismatched amino acids (e.g., the Regulator of G protein signaling 
domain, T54-C175, which includes a single amino acid mismatch at position 55 of SEQ ID NO: 
13; β-adrenergic receptor kinase β-ARK G-protein coupled transferase serine/threonine protein 
ATP-binding multi-gene PD151831 T612-L650, which includes a single amino acid mismatch at 
position 617 of SEQ ID NO: 13). It is noted that the serine/threonine protein kinase catalytic 
domain F191-F453 is 100% identical between SEQ ID NO: 13 and the β-adrenergic receptor 
kinase 2 sequence. Accordingly, SEQ ID NO: 13 is a novel variant of the β-adrenergic receptor 
kinase 2. 

B. The skilled artisan would reasonably believe that the assay methods used to 
establish a specific, substantial and credible utility for SEQ ID NO: 56 are 
valid 

The Advisory Action alleges that the "reference sample comprising heart, kidney, lung, 
placenta, small intestine, spleen, stomach, testis and uterus, comprises some tissue having a level 
of SEQ ID NO: 56 that is higher than the reference sample, which is an average of all tissues 
included." Advisory Action at page 2, emphasis added. The Advisory Action continues, 
asserting that "[m]ore likely than not, compared to brain, one or more tissues within the reference 
sample have the same or higher levels of SEQ ID NO: 5[6]," and that "[t]herefore the skilled 
artisan would not concluded that SEQ ID NO: 5[6] can be used [a]s a marker to identify brain 
tissue." Id. Appellants respectfully disagree with this assessment of the assay. 

The specification at pages 95-98 describes the microarray assays that were performed to 
determine the relative expression levels of the novel polypeptides disclosed in the present 
application in different tissues. Briefly, the microarray assays were performed as follows. 

The novel sequences disclosed in the present specification ("SEQs") were affixed to a 
specific location on a solid support (see specification at page 96, line 35, continuing to page 97, 
lines 1-14) to generate the microarrays to be tested. A common reference sample was prepared 
which included RNA isolated from a variety of tissues: brain, heart, kidney, lung, placenta, small 
intestine, spleen, stomach, testis and uterus (Specification at page 102, lines 4-8). The common 
reference sample was then exposed to the microarray SEQs (Specification at page 102, lines 3- 
12) under stringent hybridization conditions (Specification at page 97, lines 16-24). 

When the common reference sample was exposed to the microarray, each SEQ on the 
microarray provided a "signal" proportional to the amount of RNA which hybridized to that 
SEQ. Specification at page 98, line 18 and lines 22-24. For purposes of this discussion, the 
signal generated by hybridization of the common reference sample to a SEQ will be termed the 
"CR signal." The skilled artisan would understand that in general, those SEQs which are 
expressed in more tissues will have a higher CR signal than those SEQs which are expressed in 
fewer tissues, and those SEQs which have higher levels of expression in any given tissue or 
tissues will have a higher CR signal than SEQs which have lower expression levels in the same 
tissue or tissues. 

The next step of the assay was to expose the microarray to RNA derived from specific 
tissues. The tissue specific RNA was "obtained from at least three different donors," and "RNA 
from each donor was separately isolated and individually hybridized to the microarray." 
Specification at page 102, lines 8-10. Again, each SEQ on the microarray provided a "signal" 
which was proportional to the amount of RNA in the specific tissue which hybridized to that 
SEQ. Specification at page 98, line 18 and lines 22-23. The "signal" generated by the tissue- 
specific samples was then compared to the CR value. Signal-to-background and array element 
spot size were also evaluated (specification at page 98, lines 25-27). SEQs "that exhibit at least 
about a two-fold change in expression, a signal-to-background ratio of at least about 2.5, and an 
element spot size of at least about 40%, are considered to be differentially expressed." Id. 
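
For illustration only, the three criteria quoted above can be expressed as a short Python sketch. This is not code from the specification. The function names, the parameter names and the numeric readings in the examples are hypothetical, and because the specification recites each threshold as "at least about" a value, the exact cut-offs used here (2.0, 2.5 and 40%) are a simplifying assumption.

```python
def fold_change(tissue_signal: float, reference_signal: float) -> float:
    """Return the fold change in either direction (always >= 1.0)."""
    if tissue_signal <= 0 or reference_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = tissue_signal / reference_signal
    # A two-fold decrease (ratio 0.5) counts the same as a two-fold increase.
    return max(ratio, 1.0 / ratio)


def is_differentially_expressed(
    tissue_signal: float,
    reference_signal: float,
    signal_to_background: float,
    spot_size_pct: float,
    min_fold: float = 2.0,       # assumed exact value for "at least about two-fold"
    min_s2b: float = 2.5,        # assumed exact value for "at least about 2.5"
    min_spot_pct: float = 40.0,  # assumed exact value for "at least about 40%"
) -> bool:
    """Apply all three criteria; every one must be satisfied."""
    return (
        fold_change(tissue_signal, reference_signal) >= min_fold
        and signal_to_background >= min_s2b
        and spot_size_pct >= min_spot_pct
    )


# Hypothetical readings, not data from the specification.
brain_like = is_differentially_expressed(2400.0, 1000.0, 3.1, 55.0)
below_fold = is_differentially_expressed(1500.0, 1000.0, 3.1, 55.0)
noisy_spot = is_differentially_expressed(2400.0, 1000.0, 2.0, 55.0)
```

In this sketch a reading passes only when the fold change, the signal-to-background ratio and the spot size all meet their thresholds, which mirrors the conjunctive wording of the quoted passage.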

Thus, if the signal generated by tissue X is lower than the CR signal (and background and 
spot size are within acceptable parameters), it is likely that that particular SEQ is either not 
expressed in tissue X or is expressed only at extremely low levels. If the signal generated by tissue X is at 
least about two-fold higher than the CR signal (and background and spot size are within 
acceptable parameters), then that sequence is considered to be differentially expressed. 
Specification at page 98, lines 25-27. In the case of SEQ ID NO: 56, the tissue-specific signal 
was increased by at least two-fold in brain as compared to the reference sample. Specification at 
page 102, lines 17-18. No other tissue-specific expression is described in the specification for 
SEQ ID NO: 56, thus no other specific tissue provided a signal that was higher than the CR 
signal and met the background and spot size parameters. 

The Advisory Action asserts that the "reference sample comprises some tissue having a 
level of SEQ ID NO: 56 that is higher than the reference sample," and that "[m]ore likely than 
not, compared to brain, one or more tissues within the reference sample have the same or higher 
levels of SEQ ID NO: . . . [56]." However, the tissue-specific testing rules this possibility out. 
For example, if one or more tissues, say heart and lung, within the reference sample have the 
same or higher levels of SEQ ID NO: 56 as expressed in brain, it follows that the signal 
produced in the heart and lung tissue-specific expression test would be at least as high as it was 
for brain tissue. The results of the microarray analysis indicate that this is not the case. Of the 
tissue types tested, only brain showed a difference in expression level as compared to the 
common reference sample, and that difference was "at least two-fold." Specification at page 
102, lines 17-18. This "at least two-fold difference," in conjunction with an appropriate signal- 
to-background level and spot size, was deemed sufficient to be considered "differential 
expression." Specification at page 98, lines 25-27. Accordingly, Appellants respectfully contend 
that this reason for rejection is without merit and should be withdrawn. 

C. The references cited in the January 13, 2010 reply are relevant 

In the Reply dated January 13, 2010 to the final Office Action dated October 16, 2009, 
Appellants provided references to demonstrate that a two-fold difference in expression in 
microarray analysis is considered significant by those of skill in the art. The references teach that 
one of skill in the art could reasonably believe that an at least two-fold difference in expression is 
credible. 

For example, numerous microarray studies have deemed fold-difference values of 
between 1.4 and 2 fold as significant. See e.g., (1) Yue et al., Nucleic Acids Research, 29(8):e41 
(2001), reporting a 1.4 fold change in expression as significant, see abstract (EXHIBIT B); 
(2) Lee et al., Science, 285:1390-93, page 1392 (1999), reporting 1.8 fold induction and 1.6 fold 
reduction in gene expression as significant (EXHIBIT C); and (3) Vasseur et al., Molecular 
Cancer, 2(19) (2003), stating at page 2 that "differential expression values of greater than 1.7 are 
likely to be significant, based on internal quality control data," however, that a "more stringent 
ratio" of "at least 2.0 fold" was used (EXHIBIT D). Indeed, reviews on the topic conclude that 
"there is no magical absolute cut-off for a meaningful fold value" and that essentially, the 
parameters of each analysis must be considered in determining a meaningful cut-off value for 
that particular analysis. See e.g., Tsien et al., "On reporting fold differences," Pacific Symposium 
on Biocomputing, 5:496-507, at 504 (2001) (EXHIBIT E). Note that each of these references was 
submitted with the reply dated January 13, 2010, and entered into the record on January 14, 2010. 
This is confirmed in the Advisory Action dated January 26, 2010. 

With respect to the present application, the inventors concluded that for this microarray 
analysis, an "at least about 2-fold change in expression" relative to the common reference 
sample, in conjunction with "a signal-to-background ratio of at least about 2.5, and an element 
spot size of at least about 40%," was sufficient to conclude that a particular sequence was 
differentially expressed. 

With respect to the teachings of these references, the Examiner asserted that "[n]one of 
Yue et al., Lee et al or Vasseur et al discusses using polynucleotides that are tissue-specific 
markers," and that "[e]ach of said references discusses using a polynucleotide as a probe to 
detect differences in expression of the complementary polynucleotide in, for example, different 
tissues (Yue) or due to parameters such as ageing and caloric restriction (Lee), or transformation 
with ras (Vasseur)." Advisory Action dated January 26, 2010 at page 2, emphasis added. The 
Examiner concludes that "[i]n such assays, a 2-fold change may or may not be significant, 
depending on the variability in the compared samples." Advisory Action dated January 26, 2010 
at page 2. Appellants respectfully disagree with the Examiner's assertion that the cited 
references are not relevant to the present microarray analysis and results. 

The microarray assay described in the present specification does precisely what the 
Examiner alleges it does not: that is, it uses "a polynucleotide as a probe to detect differences in 
expression of the complementary polynucleotide in, for example, different tissues." Advisory 
Action dated January 26, 2010 at page 2. As described in the specification, RNA is isolated from 
different, specific tissue samples. The RNA is then exposed to the polynucleotides (e.g., SEQ ID 
NO: 56) affixed on the microarray solid support. If the RNA complement of SEQ ID NO: 56 is 
present in the sample, it will hybridize to the polynucleotide of SEQ ID NO: 56 and generate 
a signal proportional to the amount of RNA hybridized. If the RNA complement of SEQ ID NO: 
56 is not expressed in the specific tissue, no hybridization will occur and no signal will be 
generated. Accordingly, like the microarray analysis in Yue, SEQ ID NO: 56 is "used as a probe 
to detect differences in expression of the complementary polynucleotide in . . . different tissues." 

Accordingly, Appellants respectfully contend that this argument is without merit and 
should be withdrawn. 

D. The skilled artisan would find the stated utility significant, substantial and 
credible without additional statistical analysis 

With respect to the "greater than two-fold difference" shown in the microarray analysis 
for SEQ ID NO: 56 in brain, the Examiner asserts that "[t]he skilled artisan would have been 
aware of statistical methods that can be used to analyze variability and determine whether a 
difference is significant, for example, the Student's t-test." Appellants do not dispute the 
assertion that the skilled artisan would have been aware of statistical methods to analyze 
variability and determine whether a difference was significant or not. However, as described in 
section C above, the skilled artisan also knew that numerous microarray studies deemed fold- 
difference values of between 1.4 and 2 fold as significant. Accordingly, the implication that 
some statistical analysis must be present to support the asserted utility is without merit, especially 
in light of the values that were routinely relied upon in the microarray arena. 

The Advisory Action also asserts that "for a substance to be considered a tissue-specific 
marker, the expression of the substance in the tissue must be essentially exclusive, i.e., not 
expressed in other tissues." Advisory Action dated January 26, 2010 at page 2. Appellants 
respectfully contend that this has no bearing on the present situation. The expression of SEQ ID 
NO: 56 is increased by at least two-fold in brain as compared to the reference sample. No other 
tissue tested showed such differential expression. Accordingly, the skilled artisan would 
reasonably believe that if an unknown tissue was tested and showed "at least a two-fold 
difference in expression of SEQ ID NO: 56 as compared to the reference sample," that tissue 
would likely be brain tissue. If an unknown tissue was tested and it showed a less than two-fold 
difference in expression of SEQ ID NO: 56 as compared to the reference sample, that tissue 
would most likely not be brain tissue. The skilled artisan would understand that relative levels of 
expression can also be used as markers and that "essentially exclusive" expression is not always a 
requirement or limitation. 

E. Summary regarding the utility rejection under 35 U.S.C. § 101 

Appellants respectfully contend that the requirements for utility have more than been met. 

As noted in the reply dated January 13, 2010, 

The claimed invention must only be capable of performing some 
beneficial function . . . An invention does not lack utility merely 
because the particular embodiment disclosed in the patent lacks 
perfection or performs crudely . . . A commercially successful 
product is not required . . . Nor is it essential that the invention 
accomplish all its intended functions . . . or operate under all 
conditions . . . partial success being sufficient to demonstrate 
patentable utility . . . In short, the defense of non-utility cannot be 
sustained without proof of total incapacity. If an invention is only 
partially successful in achieving a useful result, a rejection of the 
claimed invention as a whole based on a lack of utility is not 
appropriate. 

M.P.E.P. § 2107.01.II (citations omitted, emphasis added). Thus, while a higher "fold 
difference" and statistical analysis may be required in some circumstances - e.g., FDA approval - 
such conditions are not required to meet the utility requirement under 35 U.S.C. § 101. In the 
present case, the expression level of SEQ ID NO: 56 is "at least two-fold higher" in brain tissue 
as compared to the control sample, and the art supports the "at least two-fold" difference in 
expression to be significant, i.e., credible. 

Accordingly, for at least the reasons stated above, the claimed invention has a specific, 
substantial and credible utility, and reconsideration and withdrawal of the rejection under 
35 U.S.C. § 101 is respectfully requested. 

F. The claimed invention meets the enablement requirement 
under 35 U.S.C. § 112, first paragraph 

The Office asserts that "since the claimed invention is not supported by either a specific 
and substantial asserted utility or a well-established utility. . .one skilled in the art clearly would 
not know how to use the claimed invention." Office Action dated October 16, 2009 at page 3. 
Appellants respectfully traverse this ground for rejection. 

As noted in sections A-E above, the claimed polynucleotides and encoded 
polypeptides have a specific, substantial and credible utility. As such, reconsideration and 
withdrawal of the rejection under 35 U.S.C. § 112, first paragraph, is respectfully requested. 
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VIII. CONCLUSION 

For at least the reasons discussed above, Appellants respectfully submit that claims 3, 4,
6, 7, and 142-144 meet the requirements of 35 U.S.C. sections 101 and 112. Accordingly,
Appellants respectfully request that the rejections be reversed in whole, and that the claims be
allowed to issue.



Respectfully submitted,



Date: March 26, 2010



By /Michele M. Simkin/ 



FOLEY & LARDNER LLP 
Customer Number: 22428 
Telephone: (202) 672-5538 
Facsimile: (202) 672-5399 



Michele M. Simkin 
Attorney for Appellant 
Registration No. 34,717 

IX. CLAIMS APPENDIX

1. (Withdrawn) An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ ID NO:1-43,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% 
identical to an amino acid sequence selected from the group consisting of SEQ ID 
NO:1, SEQ ID NO:22-23, SEQ ID NO:28, SEQ ID NO:30-32, SEQ ID NO:36-41
and SEQ ID NO:43, 

c) a polypeptide comprising a naturally occurring amino acid sequence at least 91%
identical to the amino acid sequence of SEQ ID NO:5, 

d) a polypeptide comprising a naturally occurring amino acid sequence at least 93% 
identical to the amino acid sequence of SEQ ID NO: 27, 

e) a polypeptide comprising a naturally occurring amino acid sequence at least 94% 
identical to an amino acid sequence selected from the group consisting of SEQ ID 
NO:35 and SEQ ID NO:29, 

f) a polypeptide comprising a naturally occurring amino acid sequence at least 95%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:4, SEQ ID NO:11, and SEQ ID NO:20,

g) a polypeptide comprising a naturally occurring amino acid sequence at least 96% 
identical to the amino acid sequence of SEQ ID NO:9 and SEQ ID NO: 18, 

h) a polypeptide comprising a naturally occurring amino acid sequence at least 97% 
identical to the amino acid sequence selected from the group consisting of SEQ ID 
NO:26, SEQ ID NO:33, and SEQ ID NO:34, 

i) a polypeptide comprising a naturally occurring amino acid sequence at least 98% 
identical to an amino acid sequence selected from the group consisting of SEQ ID 
NO:6-7,

j) a polypeptide comprising a naturally occurring amino acid sequence at least 99% 
identical to the amino acid sequence of SEQ ID NO: 16, 

k) a polypeptide consisting essentially of a naturally occurring amino acid sequence 
at least 90% identical to an amino acid sequence selected from the group 
consisting of SEQ ID NO:2-3, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12-15, 
SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, and SEQ ID NO:42,

l) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO: 1-43, and 

m) an immunogenic fragment of a polypeptide having an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-43. 

2. (Withdrawn) An isolated polypeptide of claim 1 comprising an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-43. 

3. (Previously Presented) An isolated polynucleotide encoding a polypeptide, wherein the 
polypeptide consists of the amino acid sequence of SEQ ID NO: 13. 

4. (Previously Presented) The isolated polynucleotide of claim 3 wherein the polynucleotide 
sequence consists of SEQ ID NO: 56. 

5. (Canceled). 

6. (Previously Presented) A recombinant polynucleotide comprising a promoter sequence 
operably linked to the polynucleotide of claim 3. 

7. (Previously Presented) A cell transformed with the recombinant polynucleotide of 
claim 6. 

8.-10. (Canceled).

11. (Withdrawn) An isolated antibody which specifically binds to a polypeptide of claim 1.

12.-13. (Canceled).

14. (Withdrawn) A method of detecting a target polynucleotide in a sample, said target 
polynucleotide comprising the polynucleotide of claim 3, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample, 
and which probe specifically hybridizes to said target polynucleotide, under 
conditions whereby a hybridization complex is formed between said probe and 
said target polynucleotide or fragments thereof, and 

b) detecting the presence or absence of said hybridization complex, and, optionally, 
if present, the amount thereof. 

15. (Withdrawn) A method of claim 14, wherein the probe comprises at least 60 contiguous
nucleotides. 

16. (Withdrawn) A method of detecting a target polynucleotide in a sample, said target
polynucleotide comprising the polynucleotide of claim 3, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain 
reaction amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or 
fragment thereof, and, optionally, if present, the amount thereof.

17. (Withdrawn) A composition comprising a polypeptide of claim 1 and a pharmaceutically
acceptable excipient. 

18. (Withdrawn) A composition of claim 17, wherein the polypeptide comprises an amino 
acid sequence selected from the group consisting of SEQ ID NO: 1-43. 

19. (Withdrawn) A method for treating a disease or condition associated with decreased
expression of functional KPP, comprising administering to a patient in need of such treatment the 
composition of claim 17. 

20. (Withdrawn) A method of screening a compound for effectiveness as an agonist of a 
polypeptide of claim 1, the method comprising:

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and 

b) detecting agonist activity in the sample. 

21.-22. (Canceled).

23. (Withdrawn) A method of screening a compound for effectiveness as an antagonist of a 
polypeptide of claim 1, the method comprising: 

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and 

b) detecting antagonist activity in the sample. 

24.-25. (Canceled).

26. (Withdrawn) A method of screening for a compound that specifically binds to the 
polypeptide of claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under 
suitable conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby 
identifying a compound that specifically binds to the polypeptide of claim 1.

27. (Withdrawn) A method of screening for a compound that modulates the activity of the 
polypeptide of claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under 
conditions permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test 
compound, and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test 
compound with the activity of the polypeptide of claim 1 in the absence of the test 
compound, wherein a change in the activity of the polypeptide of claim 1 in the 
presence of the test compound is indicative of a compound that modulates the 
activity of the polypeptide of claim 1.

28. (Withdrawn) A method of screening a compound for effectiveness in altering expression
of a target polynucleotide, wherein said target polynucleotide comprises the polynucleotide
sequence of claim 3, the method comprising: 

a) contacting a sample comprising the target polynucleotide with a compound, under 
conditions suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying 
amounts of the compound and in the absence of the compound. 

29. (Withdrawn) A method of screening for potential toxicity of a test compound, the 
method comprising: 

a) treating a biological sample containing nucleic acids with the test compound, 

b) hybridizing the nucleic acids of the treated biological sample with a probe 
comprising at least 20 contiguous nucleotides of the polynucleotide of claim 3 
under conditions whereby a specific hybridization complex is formed between 
said probe and a target polynucleotide in the biological sample, said target 
polynucleotide comprising the polynucleotide of claim 3, 

c) quantifying the amount of hybridization complex, and 

d) comparing the amount of hybridization complex in the treated biological sample
with the amount of hybridization complex in an untreated biological sample, 
wherein a difference in the amount of hybridization complex in the treated 
biological sample indicates potential toxicity of the test compound. 

30. (Withdrawn) A method for a diagnostic test for a condition or disease associated with the 
expression of KPP in a biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions 
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide 
complex, and 

b) detecting the complex, wherein the presence of the complex correlates with the 
presence of the polypeptide in the biological sample. 

31. (Withdrawn) The antibody of claim 11, wherein the antibody is:

a) a chimeric antibody, 

b) a single chain antibody, 

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 

32. (Withdrawn) A composition comprising an antibody of claim 11 and an acceptable
excipient. 

33. (Canceled). 

34. (Withdrawn) A composition of claim 32, further comprising a label. 

35. (Canceled). 

36. (Withdrawn) A method of preparing a polyclonal antibody with the specificity of the 
antibody of claim 11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-43, or an immunogenic 
fragment thereof, under conditions to elicit an antibody response, 

b) isolating antibodies from the animal, and 

c) screening the isolated antibodies with the polypeptide, thereby identifying a 
polyclonal antibody which specifically binds to a polypeptide comprising an 
amino acid sequence selected from the group consisting of SEQ ID NO: 1-43. 

37.-43. (Canceled).

44. (Withdrawn) A method of detecting a polypeptide comprising an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-43 in a sample, the method comprising: 

a) incubating the antibody of claim 11 with the sample under conditions to allow
specific binding of the antibody and the polypeptide, and 

b) detecting specific binding, wherein specific binding indicates the presence of a 
polypeptide comprising an amino acid sequence selected from the group 
consisting of SEQ ID NO: 1-43 in the sample. 

45. (Withdrawn) A method of purifying a polypeptide comprising an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-43 from a sample, the method comprising: 

a) incubating the antibody of claim 11 with the sample under conditions to allow
specific binding of the antibody and the polypeptide, and 

b) separating the antibody from the sample and obtaining the purified polypeptide 
comprising an amino acid sequence selected from the group consisting of SEQ ID 
NO:1-43.

46. (Withdrawn) A microarray wherein at least one element of the microarray is a 
polynucleotide of claim 13. 

47. (Withdrawn) A method of generating an expression profile of a sample which contains 
polynucleotides, the method comprising: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled
polynucleotides of the sample under conditions suitable for the formation of a 
hybridization complex, and 

c) quantifying the expression of the polynucleotides in the sample. 

48. (Withdrawn) An array comprising different nucleotide molecules affixed in distinct 
physical locations on a solid substrate, wherein at least one of said nucleotide molecules 
comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at 
least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide 
is a polynucleotide of claim 12. 

49. (Withdrawn) An array of claim 48, wherein said first oligonucleotide or polynucleotide 
sequence is completely complementary to at least 30 contiguous nucleotides of said target 
polynucleotide. 

50. (Withdrawn) An array of claim 48, wherein said first oligonucleotide or polynucleotide 
sequence is completely complementary to at least 60 contiguous nucleotides of said target 
polynucleotide. 

51. (Withdrawn) An array of claim 48, wherein said first oligonucleotide or polynucleotide
sequence is completely complementary to said target polynucleotide. 

52. (Withdrawn) An array of claim 48, which is a microarray. 

53. (Withdrawn) An array of claim 48, further comprising said target polynucleotide
hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide 
sequence. 

54. (Withdrawn) An array of claim 48, wherein a linker joins at least one of said nucleotide 
molecules to said solid substrate. 

55. (Withdrawn) An array of claim 48, wherein each distinct physical location on the 
substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any 
single distinct physical location have the same sequence, and each distinct physical location on 
the substrate contains nucleotide molecules having a sequence which differs from the sequence 
of nucleotide molecules at another distinct physical location on the substrate. 

56.-141. (Canceled).

142. (Previously Presented) An isolated polynucleotide complementary to the polynucleotide 
of claim 3. 

143. (Previously Presented) An isolated polynucleotide complementary to the polynucleotide
of claim 4. 

144. (Previously Presented) An RNA equivalent of the polynucleotide of claim 3. 
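
As a hedged illustration of the sequence relationships recited in claims 142-144, the sketch below derives a reverse complement and an RNA equivalent from a DNA string. The short example sequence and the function names are hypothetical; SEQ ID NO:56 itself is not reproduced here, and "RNA equivalent" is assumed to mean the same linear sequence with thymine replaced by uracil.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence contains non-ACGT characters")
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the same linear sequence with T replaced by U (assumed definition)."""
    return dna.upper().replace("T", "U")


example = "ATGGCGGACCTG"  # hypothetical 12-mer, not from the sequence listing
print(reverse_complement(example))  # CAGGTCCGCCAT
print(rna_equivalent(example))      # AUGGCGGACCUG
```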

X. EVIDENCE APPENDIX

Exhibit A: 

Parruti et al., Molecular Cloning, Functional Expression and mRNA Analysis of Human Beta-Adrenergic
Receptor Kinase 2, Biochemical and Biophysical Research Communications, Vol.
190(2): 475-481 (1993).

The Parruti et al. reference was incorporated into the specification by reference (see Table 2 at
page 121; Specification at page 44, lines 9-10). With respect to incorporation by reference,
"[t]he information incorporated is as much a part of the application as filed as if the text was
repeated in the application, and should be treated as part of the text of the application as filed."
(MPEP § 2163.07(b)). Accordingly, Parruti et al. is inherently part of the record.

Exhibit B: 

Yue et al., An Evaluation of the Performance of cDNA Microarrays for Detecting Changes in
Global mRNA Expression, Nucleic Acids Research, Vol. 29(8): e41 (2001).

The Yue et al. reference was submitted with the reply dated January 13, 2010, and entered into 
the record on January 14, 2010. This is confirmed in the Advisory action dated January 26, 
2010. 

Exhibit C: 

Lee et al., Gene Expression Profile of Aging and its Retardation by Caloric Restriction, Science,
Vol. 285(5432):1390-93 (1999).

The Lee et al. reference was submitted with the reply dated January 13, 2010, and entered into
the record on January 14, 2010. This is confirmed in the Advisory action dated January 26,
2010.

Exhibit D: 

Vasseur et al., Gene Expression Profiling by DNA Microarray Analysis in Mouse Embryonic
Fibroblast transformed by rasV12 Mutated Protein and the E1A Oncogene, Molecular Cancer,
Vol. 2(19), (2003).

The Vasseur et al. reference was submitted with the reply dated January 13, 2010, and entered
into the record on January 14, 2010. This is confirmed in the Advisory action dated January 26,
2010.

Exhibit E: 

Tsien et al., On Reporting Fold Differences, Pacific Symposium on Biocomputing, 6:496-507
(2001).

The Tsien et al. reference was submitted with the reply dated January 13, 2010, and entered into
the record on January 14, 2010. This is confirmed in the Advisory action dated January 26,
2010.

EXHIBIT A - Parruti et al.



Vol. 190, No. 2, 1993 BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS

January 29, 1993 Pages 475-481

This material may be protected by Copyright law (Title 17 U.S. Code)



MOLECULAR CLONING, FUNCTIONAL EXPRESSION AND mRNA ANALYSIS OF
HUMAN BETA-ADRENERGIC RECEPTOR KINASE 2 



Giustino Parruti, Grazia Ambrosini, Michele Sallese, and Antonio De Blasi¹



Consorzio Mario Negri Sud, 
Istituto di Ricerche Farmacologiche "Mario Negri",
Santa Maria Imbaro, Italy 



Received December 1, 1992



In the present study the cDNA of human BARK2 was cloned using both PCR
and cDNA library screening, subcloned into an expression vector and transiently
expressed in COS7 cells. The expressed kinase activity was ~40% as efficient as
human BARK1 in phosphorylating bovine rod outer segments in vitro. Northern blot
analysis of human and bovine mRNA revealed a species-specific pattern of multiple
hybridization bands, with two major transcripts in human rather than one in bovine.
High levels of mRNA expression were found in peripheral blood leukocytes. © 1993
Academic Press, Inc.



For a number of receptors a rapid loss of responsiveness has been shown to
occur upon exposure to agonists (1). This phenomenon is known as homologous
desensitization, and has been best characterized on the model of the β2-adrenergic
receptor (BAR) (1). Receptor phosphorylation is an early step in the process of
desensitization of these receptors. A selective kinase (called β-adrenergic receptor
kinase, BARK) has been identified which phosphorylates the agonist-occupied
form of the receptor, as demonstrated in vitro for BAR (2). Other G-coupled
receptors, including the α2-adrenergic receptor (3), muscarinic cholinergic
receptors (4) and rhodopsin (5), can be phosphorylated in vitro by BARK in an
agonist-dependent manner. The BAR agonist isoproterenol induces translocation of
BARK from cytosol to membranes upon stimulation of BAR (1,6). This represents the



¹To whom correspondence should be addressed at the Consorzio Mario Negri
Sud, via Nazionale, 66030 S. Maria Imbaro, Italy. Fax: 39.872.578.24t). 

Abbreviations: BARK, beta-adrenergic receptor kinase; BAR, beta-adrenergic
receptors; PAF, platelet activating factor; F, forward and R, reverse primers; PCR,
polymerase chain reaction; PBL, peripheral blood leukocytes (granulocytes+
lymphocytes+monocytes); MNL, mononuclear leukocytes (lymphocytes+
monocytes); ROS, rod outer segments.
first step of BARK activation. A role for βγ subunits of G-proteins has been recently
demonstrated in the process of BARK translocation and activation (7). Two subtypes
of BARK have been identified by molecular cloning in the bovine, called BARK1 and
BARK2 (8). The highest levels of specific mRNA were found in the central nervous
system and in highly innervated tissues, suggesting that the BARK-mediated
mechanism of receptor desensitization may be primarily active on synaptic
receptors (1,2,8). We have recently cloned the cDNA for human BARK1 and shown
that peripheral blood leukocytes (PBL) represent a major site of expression (6). The
β-adrenergic agonist isoproterenol and platelet activating factor (PAF), which in these
cells act as potent immunomodulators, were able to induce BARK translocation (6).
We suggested a role of BARK in the regulation of receptor-mediated immune 
functions (6). 

We present here the molecular cloning and functional expression of human 
BARK2. mRNA tissue distribution of human BARK2 was also investigated. High 
levels of specific mRNA are present in PBL, further supporting a possible role for 
these kinases in immunological settings. 

MATERIALS AND METHODS 

Tissue and cell sources - Cultured cells (American Tissue Culture
Collection), growing under standard conditions, were harvested directly in
guanidine isothiocyanate (BRL). Four different human cell lines were studied: two
hepatomas (Hep G2 and SK-HEP-1), one lung carcinoma (A549), and one
neuroblastoma (IMR32). Human umbilical vein endothelial cells (HEC) were
isolated from umbilical cord vein and analyzed at the 4th-6th passage. Human and
bovine PBL and mononuclear leukocytes (MNL) were fractionated as previously
described (6). Bovine tissues were frozen in liquid nitrogen after collection in the
local slaughterhouse and total RNA prepared as described (6).

PCR cloning - The general PCR cloning strategy was as described in
detail in ref.6. Forward (F) and reverse (R) primers indicated below are positioned
on sequences numbered starting at the beginning of the coding region, i.e. base 1
is the A of ATG. When this study was planned, the cloning of bovine BARK2 had not
yet been reported. Based on the sequence of bovine BARK1 (2) we designed
oligonucleotides F3 (bp 586-608 of the coding region as in ref.2) and R3 (bp 1092-1069,
ref.2) to amplify human BARK subtypes. We obtained a PCR product which
turned out to correspond to bp 586-1069 of human BARK2 cDNA. To clone the
entire coding region, we used primers F1, R1 and R2 (bp 1-19, 2064-2048 and
1566-1546 respectively of the coding sequence of bovine BARK2, ref.8), and
primers F2 (bp 291-310), F4 (bp 913-932), F5 (bp 1300-1320), F6 (bp 1407-1428),
R4 (bp 543-524) and R5 (bp 252-232) designed on the human sequence obtained
from previously sequenced PCR products or library clones. The cDNA fragments
cloned are: F1-R5, F2-R4, F3-R3, F4-R2, F5-R1 and F6-R1. To obtain the first
cDNA strand, 1 µg total RNA from human adipose tissue or MNL was reverse-transcribed
using random hexamers (Pharmacia LKB) and M-MLV reverse
transcriptase (BRL). Amplifications were carried out as previously described (6).
PCR products were subcloned blunt-end in pTZ18R or Bluescript and sequenced in
both directions with T7 DNA polymerase (Pharmacia LKB).
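
The primer coordinates given above lend themselves to a small worked example. The sketch below, which is an illustration rather than anything taken from the paper, computes the span covered by a forward/reverse primer pair and the primer-free interior; for F3 and R3 the interior comes out as bp 609-1068. The `Primer` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    name: str
    first: int  # coding-region position of the primer's 5' base
    last: int   # coding-region position of the primer's 3' base


def amplicon_span(fwd: Primer, rev: Primer) -> tuple[int, int]:
    """Outer coordinates of the product, primers included."""
    return min(fwd.first, fwd.last), max(rev.first, rev.last)


def interior_span(fwd: Primer, rev: Primer) -> tuple[int, int]:
    """Coordinates of the region lying strictly between the two primers."""
    return max(fwd.first, fwd.last) + 1, min(rev.first, rev.last) - 1


def span_length(span: tuple[int, int]) -> int:
    """Number of base pairs in an inclusive coordinate span."""
    start, end = span
    return end - start + 1


# Coordinates as quoted in the text (reverse primers run high-to-low).
f3 = Primer("F3", 586, 608)
r3 = Primer("R3", 1092, 1069)
print(amplicon_span(f3, r3), span_length(amplicon_span(f3, r3)))  # (586, 1092) 507
print(interior_span(f3, r3), span_length(interior_span(f3, r3)))  # (609, 1068) 460
```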

cDNA library screening - A human pituitary cDNA library in lambda
bluemid (cloning site EcoRI and HindIII; Clontech, Palo Alto, CA), containing
1,5X10^ recombinants, was screened. The 609-1068 cDNA fragment of human
BARK2, labelled by random priming (Amersham International kit), was used as a
probe. The screening was done under high stringency conditions.

Expression of BARK2 in COS7 cells and phosphorylation assays -
The 2.1 kb full length cDNA of BARK2 was subcloned into the eukaryotic expression
vector pBJI-neo (a kind gift of Dr. S. Alberti, Ist. Mario Negri), using restriction
endonucleases from BRL. COS7 cells, grown to ~70% confluence in 80 mm plates,
were transfected by the DEAE dextran procedure, using 10 µg of plasmid DNA for
each plate. Cells were harvested 72 hours after transfection and cytosolic proteins
prepared as previously described (9). Bovine rod outer segments (ROS) were
prepared from bovine retina by stepwise sucrose gradient sedimentation. Retinal
rhodopsin kinase was degraded by treatment with 5M urea. Phosphorylation
reactions were carried out as described (9), using 3-5 µg of cytosolic proteins
prepared from transfected cells, 300 pmol urea-treated ROS and 65 µM [γ-32P]ATP
(0.5-5 cpm/fmol) in a final volume of 40 µl. Following electrophoresis in
polyacrylamide gels, for quantitative measurement of BARK activity rhodopsin
bands (mw ~35 kDa), identified by Coomassie Blue staining, were cut and counted
for 32P radioactivity.

Northern blot analysis - Total RNA was isolated by the guanidine
isothiocyanate/cesium chloride method as described (6). Total RNA (20 µg) was
fractionated on a 1% agarose-formaldehyde gel and transferred to a Gene Screen
Plus membrane (New England Nuclear, Boston, MA). The RNA blot was hybridized
with a random primed radioactive cDNA fragment (bp 609-1068) of human BARK2
in 50% formamide, 10% dextran sulphate, 1% SDS, 5.8% NaCl, and denatured
salmon sperm DNA (100 µg/ml) for 24h at 42°C. Blots were washed in 2X SSC-1%
SDS at 60°C (low stringency) or in 0.2X SSC-0.1% SDS at 65°C (high stringency)
and subjected to autoradiography for 1-5 days at -80°C. All the results were
confirmed on RNA from at least two different preparations.



RESULTS AND DISCUSSION 

The cDNA of human BARK2 was cloned by PCR and cDNA library screening.
The first PCR clone obtained (bp 609-1068) was amplified with non-degenerate
primers designed on sequences in the catalytic domain of bovine BARK1 (2). This
cDNA fragment was used to screen a human pituitary cDNA library. Two partial and
partly overlapping clones provided the sequence from bp 149 to 1115 of the coding
region. The cloning was then completed by PCR. cDNA² and amino acid
sequences of human BARK2 are 72% and 84% identical to the corresponding
sequences of human BARK1, in keeping with what was reported for bovine subtypes (8).
In addition, the amino acid sequence of human BARK2 is 95% identical to that of
bovine BARK2, a very high level of interspecies conservation. Amino acid
sequences of human and bovine BARK1 and BARK2 are aligned in Fig.1. In most
cases of interspecies amino acid variation for both BARK1 and BARK2, one
amino acid is the same as that conserved in the other subtype, consistent with the
idea that all four genes are derived from a common ancestor. The highest levels of
conservation are found in the catalytic domain of the kinases (aa 198-436), and in

²The nucleotide sequence of human BARK2 cDNA has been submitted to the
GenBank/EMBL Data Bank with accession number X69117.

[Figure 1 alignment: amino acid sequences of the four kinases in rows of 70 residues, positions 1-688, with asterisks under conserved residues]

Figure 1. Alignment of the amino acid sequences of human BARK2, bovine
BARK2, human BARK1 and bovine BARK1. Human BARK1 is from ref.6, bovine
BARK1 from ref.2, bovine BARK2 from ref.8. Asterisks indicate amino acids that are
conserved among all four genes.
the first 47 amino acids, which are identical in the 4 sequences except for two
conservative substitutions. Our sequence data support the view that amino acid
substitutions are clustered (8), as most of them fall in three regions, i.e. the central
portion of the amino-terminal domain and the initial and final portions of the
carboxyl-terminal domain. As no data are presently available as to which regions
determine substrate specificity in these kinases, it is likely that amino acid stretches
where differences are clustered are involved in substrate recognition and binding.
The carboxy-terminal region of these proteins is known to be involved in binding of
βγ subunits of G proteins (7). This region appears as the most variable, suggesting
that different subtypes of BARK may specifically interact with different subtypes of βγ
subunits.
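
Percent identity figures such as those quoted above are computed over aligned positions. The following is a minimal sketch under simplifying assumptions: both sequences are already aligned to equal length, gaps are written as '-', and columns containing a gap are excluded from the denominator (other conventions exist). The function name is hypothetical, and the two short peptides are invented for illustration and are not taken from Figure 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns holding the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptides, for illustration only.
seq_a = "MKTLLEAV-D"
seq_b = "MKSLLEAVAD"
print(percent_identity(seq_a, seq_b))
```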

The full-length cDNA of human BARK2 was subcloned into the expression
vector pBJI-neo (called pBJI-BARK2). Parallel transfections were done with the
expression vector alone (pBJI-neo) and with the full-length cDNA for human
BARK1, inserted in the same expression vector (pBJI-BARK1) (9). BARK activity was
assayed in vitro by ROS phosphorylation assays. When compared with COS7 cells
transfected with pBJI-neo, a 10 to 15 fold increase in BARK activity was obtained in
COS7 cells transfected with pBJI-BARK2, whereas a 25 to 40 fold increase in BARK
activity was revealed in cells transfected with pBJI-BARK1 (Fig.2). As expected,
kinase activity was inhibited by heparin (Fig.2). Human BARK2 phosphorylated
bovine ROS less efficiently than human BARK1 expressed in parallel experiments,
that is with ~40% efficiency of human BARK1. This was in reasonable agreement
with a previous report on bovine BARK2, which phosphorylated bovine ROS ~20%
as efficiently as bovine BARK1 (8). Our expression system provides a simple and
powerful tool for the screening of inhibitors or modulators of both human kinases,
making use of the same assay, based on phosphorylation of ROS.

Northern blot analysis of human βARK2 revealed two major hybridization 
species present in all human cells tested, appearing as a doublet at ~8 and ~7 kb 
respectively, whereas a single major hybridization band was detected on bovine 
RNAs at ~7 kb (Fig. 3 and ref. 8). The finding of two major mRNA species in human 
instead of one in bovine is unexpected and peculiar, as this difference in mRNA 
expression distinguishes two otherwise highly conserved genes. The relative 
abundance of the two human hybridization species varied significantly, with a 
prevalence of the longer transcript in IMR32, a cell line from human neuroblastoma, 
and nearly equal intensities for the two transcripts in monocytes and granulocytes 
(Fig. 3, left). In many cases, additional hybridization bands were revealed both in 
human and bovine. These bands, sized ~3.5 and 2 kb in human and ~3 kb in 
bovine, were detected even after high stringency washings. Multiple transcripts 
from a single gene and analogous interspecies differences in the hybridization 
pattern have already been reported for two subtypes of protein kinase C (10). A 
likely explanation for both phenomena may be alternative mRNA processing of the 
lengthy non-coding sequences present in these mRNAs. 

Figure 2. In vitro rhodopsin phosphorylation by transiently expressed βARK2. 
βARK activity was assayed in COS7 cells transfected with pBJI-neo, with pBJI-βARK1 
and with pBJI-βARK2. Inhibition of βARK2 activity by heparin (10 ng/ml, 
Sigma H-3125) is also shown (pBJI-βARK2 + Hep). The dried gel was exposed for 
18 hours at [temperature illegible]. The positions of molecular mass standards, 
expressed in kilodaltons, and of the rhodopsin bands (ROS) are indicated. 
Figure 3. Northern blot analysis of human and bovine mRNA from various 
tissues and cell types. Two separate blots are shown. Left, total RNA (20 µg) from 
human monocytes (MONO), granulocytes (PMN), HepG2, SK-HEP-1 (SK), 
endothelial cells (EC), IMR32 and A549 cell lines was probed with a random 
primed βARK2 cDNA fragment (bp 609-1068). Washed filters were exposed at 
-80°C for 36-72 h. Right, total RNA (20 µg) from human monocytes (h MONO) and 
bovine leukocytes (PBL), brain, cerebellum (CEREB), heart and spleen was probed 
as above. Washed filters were exposed at -80°C for 36-96 h. 

High levels of βARK2 mRNA were detected in human leukocytes compared 
to various human cell types (Fig. 3). A detectable signal was found in most cell lines 
studied, with IMR32 showing the highest expression (Fig. 3, left). Analysis of βARK2 
mRNA in human tissues revealed moderate expression in lung, heart and adipose 
tissue (not shown). Comparison of mRNA expression in leukocytes and brain was 
performed on bovine material. Bovine leukocytes showed slightly lower expression than 
bovine cerebellum, but higher expression than bovine brain (Fig. 3, right). 

The finding of high levels of βARK2 mRNA in leukocytes is in line with other 
data obtained in our laboratory, showing that PBL represent a major site of 
expression for βARK (6). We found high levels of βARK1 mRNA as well as high 
levels of βARK activity in leukocytes (6). A functional role for βARK in immune cells 
was suggested by our finding of βARK translocation in MNL exposed to the 
β-adrenergic receptor agonist isoproterenol and to PAF, which act as potent 
immunomodulators on these cells. Another cytosolic protein, β-arrestin, acts as a 
cofactor of βARK to induce maximal desensitization of phosphorylated receptors (11). 
Indeed, we also found high levels of expression of both known β-arrestin subtypes 
in human MNL (G. Parruti and A. De Blasi, unpublished observation). Altogether 
these data strongly support a functional role of the βARK/β-arrestin mechanism of 
receptor desensitization in immune cells. Receptors for many immunomodulators 
present on PBL, such as PAF receptors, belong to the superfamily of G-coupled 
receptors and share relevant homology in terms of sequence and structure with 
G-coupled receptors that are known to be regulated by the βARK/β-arrestin system. It 
is conceivable that these receptors also share the same molecular mechanisms of 
homologous desensitization. However, the possibility that any of the βARK 
subtypes, highly expressed in PBL, is actually involved in the regulation of any of 
these candidate receptors mediating immune functions remains to be investigated. 

In conclusion, with the cloning of the cDNA for human βARK2, sequence 
information is now available for both known subtypes of βARK in human. This will 
help in testing the possible involvement of these kinases in the regulation of 
candidate G-coupled receptors as well as in the pathogenesis of diseases for which 
an anomalous regulation of receptors has been reported. 

ABSTRACT 

The cDNA microarray is one technological approach 
that has the potential to accurately measure changes 
in global mRNA expression levels. We report an 
assessment of an optimized cDNA microarray platform 
to generate accurate, precise and reliable data 
consistent with the objective of using microarrays as 
an acquisition platform to populate gene expression 
databases. The study design consisted of two independent 
evaluations with 70 arrays from two different 
manufactured lots and used three human tissue 
sources as samples: placenta, brain and heart. 
Overall signal response was linear over three orders 
of magnitude and the sensitivity for any element was 
estimated to be 2 pg mRNA. The calculated coefficient 
of variation for differential expression for all 
non-differentiated elements was 12-14% across the entire 
signal range and did not vary with array batch or 
tissue source. The minimum detectable fold change 
for differential expression was 1.4. Accuracy, in 
terms of bias (observed minus expected differential 
expression ratio), was less than 1 part in 10 000 for all 
non-differentiated elements. The results presented in 
this report demonstrate the reproducible performance 
of the cDNA microarray technology platform and the 
methods provide a useful framework for evaluating 
other technologies that monitor changes in global 
mRNA expression. 

INTRODUCTION 

The construction of gene expression databases is a high 
priority of today's research community. Such databases, 
closely integrated with other types of genomic information, 
promise not only to facilitate our understanding of many 
fundamental biological processes, but also to accelerate drug 
discovery and lead to customized diagnosis and treatment of 
disease (1-6). 

These databases will require the development of one or more 
underlying supporting technologies that can accurately and 
reproducibly measure changes in global mRNA expression 
levels. The ideal technology should be able to process large 
numbers of samples, require minimal amounts of biological 
source material and be applicable across a wide range of cell or 
tissue types. Several different technologies are currently being 
investigated for their ability to meet these stringent requirements 
(7-12). While many of these technologies show significant 
promise in preliminary studies, it is critically important that 
each technology be comprehensively evaluated as a complete 
system for producing accurate, precise and reliable expression 
data (13,14). 

The Incyte cDNA microarray technology platform simultaneously 
analyzes the relative expression levels of up to 10 000 
genes, each of which is present as a unique cDNA element (7). 
The platform is potentially scalable to include all of the 
elements in the human genome. PCR-derived elements 
averaging 1000 nt in length are physically arrayed in a 
two-dimensional grid on a chemically modified glass slide. Aliquots 
from two purified mRNA samples are separately reverse 
transcribed using primer sets labeled with two different 
fluorophores and the resulting dye-labeled cDNA populations are 
used to probe the target elements in a competitive hybridization 
reaction. After hybridization the glass slide is analyzed in a 
two-channel fluorescence scanner and the ratio between the 
two fluorophores detected for any given element defines the 
relative amount of the mRNA corresponding to that element 
present in the original two samples. 

There are many process variables that will affect the 
quality of the data generated by any microarray technology 
platform. In this report we describe parameters for the 
manufacture of effective cDNA microarrays with highly 
reproducible performance characteristics, the quality and quantity of 
sample mRNAs used to create the dye-labeled cDNA probe 
and the effects of these optimized procedures on the overall 
performance, accuracy, precision and reliability of expression 
data generated from the two-channel ratiometric approach. 

MATERIALS AND METHODS 

Synthesis of PCR products 

PCR was used to generate large quantities of defined target 
DNA for microarray production. Plasmids containing cloned 
genes were grown in Escherichia coli and were amplified 
using vector primers SK536 (5'-GCGAAAGGGGGATGTGCTG-3') 
and SK865 (5'-GCTCGTATGTTGTGTGGAA-3') 
(Operon Technologies, Alameda, CA). Briefly, 1 µl of bacterial 
cell culture was added to 75 µl of reaction buffer, containing 
10 mM Tris-HCl pH 8.3, 1.5 mM MgCl2, 50 mM KCl, 0.2 mM 
each dNTP, 0.5 µM each primer and 2 U Taq polymerase. The 
mixture was incubated for 3 min at 95°C and 30 cycles of PCR 
were performed at 94°C for 30 s, 56°C for 30 s and 72°C for 
90 s. A final incubation for 5 min at 72°C was followed by 
reduction of the temperature to 4°C in order to terminate the 
reaction. PCR products were then purified by centrifugal 
chromatography with Sephadex S400 resin (Amersham-Pharmacia 
Biotech, Uppsala, Sweden) in a 96-well format. 
Briefly, 400 µl of S400 resin pre-equilibrated in 0.2x standard 
saline citrate buffer (SSC) was added to each well of a 96-well 
microtiter plate. A unique PCR product prepared as described 
above was loaded into each well and the plate was centrifuged 
in an Eppendorf 5810 centrifuge at 885 r.c.f. (relative centrifugal 
force). Purified PCR products were concentrated to dryness 
and resuspended in 10 µl of H2O. DNA was resolubilized by 
thermal cycling (five cycles of 85°C for 30 s and 20°C for 30 s). 

Qualification and quantification of PCR products 

PCR products were routinely analyzed for quality by agarose 
gel electrophoresis and samples that failed to amplify or had 
multiple bands were annotated in the GEMTools database 
management software (Incyte Genomics, Fremont, CA). PCR 
products were quantified using PicoGreen dye (Molecular 
Probes, Eugene, OR) in a fluorescent assay specific for 
measuring double-stranded DNA concentration according to 
the manufacturer's instructions. 

Arraying and post-processing 

Ten thousand PCR products were arrayed by high speed 
robotics (7) on amino-modified glass slides (M. Reynolds, 
unpublished results). Each element occupied a spot of ~150 µm in 
diameter and spot centers were 170 µm apart. DNA adhesion 
to the glass was achieved by irradiation in a Stratalinker Model 
2400 UV illuminator (Stratagene, San Diego, CA) with light at 
254 nm and an energy output of 120 000 µJ/cm². To minimize 
any potential non-specific probe interactions with the glass the 
microarrays were washed for 2 min in 0.2% SDS (Life 
Technologies, Rockville, MD), followed by three rinses in H2O for 
1 min each. The microarrays were treated with 0.2% (w/v) I-block 
(Tropix, Bedford, MA) in phosphate-buffered saline (PBS) for 
30 min at 60°C. They were washed again for 2 min in 0.2% 
SDS, rinsed three times in H2O for 1 min each and finally dried 
by a brief centrifugation. Dried microarrays were routinely 
stored in opaque plastic slide boxes at room temperature. 

Array qualification: SYTO 61 dye 

As SYTO 61 nucleic acid staining has generally been applied 
to cells, the standard procedure was modified to allow its use 
for measurement of DNA bound to microarrays. A 5 µM stock 
solution of SYTO 61 dye (Molecular Probes) in DMSO was 
diluted 1:100 in 10 mM Tris-HCl pH 7, 0.1 mM EDTA (TE). 
Several microarrays from each manufactured batch were 
immersed in this solution for 5 min at room temperature, rinsed 
with TE, rinsed with H2O and finally with absolute ethanol. 
After drying the microarrays were scanned on a GenePix 
4000A scanner (Axon Instruments, Foster City, CA) at 535 nm. 



mRNA preparation and probe synthesis 

Briefly, mRNA was isolated by a single round of poly(A) 
selection using Oligotex resin (Qiagen, Valencia, CA) from 
commercially available human placenta, brain and heart total 
RNA (Biochain, San Leandro, CA). The purified mRNA was 
quantified using RiboGreen dye (Molecular Probes) in a 
fluorescent assay. RiboGreen dye was diluted 1:200 (v/v final) 
and mixed with known RNA concentrations (determined by 
absorbance at 260 nm) ranging from 1 to 5000 ng/ml. A 
Millennium RNA size ladder (Ambion, Austin, TX) was used 
to generate standard curves and unknown samples were diluted 
as necessary. Fluorescence was measured in 96-well plates 
with a FLUOstar fluorometer (BMC Lab Technologies, 
Germany) fitted with 485 nm (excitation) and 520 nm (emission) 
filters. 
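
The standard-curve step above can be illustrated with a minimal sketch. It assumes the fluorescence response is linear in RNA concentration over the range used; the function names and all numbers below are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_fluorescence(fluorescence, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and scale by the dilution applied to the unknown."""
    return (fluorescence - intercept) / slope * dilution_factor


# Hypothetical standards: known RNA concentrations (ng/ml) and their readings.
standards_ng_per_ml = [1.0, 10.0, 100.0, 1000.0]
standards_fluorescence = [12.0, 30.0, 210.0, 2010.0]

slope, intercept = fit_line(standards_ng_per_ml, standards_fluorescence)

# Hypothetical unknown, read after a 10-fold dilution.
unknown_ng_per_ml = concentration_from_fluorescence(410.0, slope, intercept, dilution_factor=10.0)
print(round(unknown_ng_per_ml, 1))
```

Diluting the unknown keeps its reading inside the range spanned by the standards, which is why the dilution factor is applied after inverting the curve.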

Between 25 and 100 ng mRNA were separated on an Agilent 
2100 Bioanalyzer, a high resolution electrophoresis system 
(Agilent Technologies, Palo Alto, CA), to examine the mRNA 
size distribution. A total of 200 ng of purified mRNA was converted to 
either a Cy3- or Cy5-labeled cDNA probe using a custom 
labeling kit (Incyte Genomics). Each reaction contained 
50 mM Tris-HCl pH 8.3, 75 mM KCl, 15 mM MgCl2, 4 mM 
DTT, 2 mM dNTPs (0.5 mM each), 2 µg Cy3 or Cy5 random 
9mer (Trilink, San Diego, CA), 20 U RNase inhibitor 
(Ambion), 200 U MMLV RNase H-free reverse transcriptase 
(Promega, Madison, WI) and mRNA. Correspondingly labeled 
Cy3 and Cy5 cDNA products were combined and purified on a 
size exclusion column, concentrated by ethanol precipitation 
and resuspended in hybridization buffer. 

Array qualification: complex and vector-specific 
hybridizations 

Hybridization of labeled cDNA probes was performed in 20 µl of 
5x SSC, 0.1% SDS, 1 mM DTT at 60°C for 6 h. Hybridization 
with a Cy3-labeled vector-specific oligonucleotide (5'-TTGG- 
ACXnTGGGGTAATGATGGTCATACjGTGTTTGGTGTGT- 
GAAATTGTTATGGGGTGA-3') (Operon Technologies) was 
performed at 10 ng/µl in 5x SSC, 0.1% SDS, 1 mM DTT at 
60°C for 1 h. The microarrays were washed after hybridization 
in 1x SSC, 0.1% SDS, 1 mM DTT at 45°C for 10 min and then 
in 0.1x SSC, 0.2% SDS, 1 mM DTT at room temperature for 
3 min. After drying by centrifugation, microarrays were scanned 
with an Axon GenePix 4000A fluorescence reader and GenePix 
image acquisition software (Axon) at 535 nm for Cy3 and 625 nm 
for Cy5. An image analysis algorithm in GEMTools software 
(Incyte Genomics) was used to quantify signal and background 
intensity for each target element. The ratio of the two corrected 
signal intensities was calculated and used as the differential 
expression ratio (DE) for this specific gene in the two mRNA 
samples. 
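
As a minimal sketch of the ratio calculation described above, assume that "corrected" means background-subtracted and that each element has one signal and one background value per channel. The function name and the example numbers are hypothetical and do not describe the GEMTools implementation.

```python
def differential_expression(cy3_signal, cy3_background, cy5_signal, cy5_background):
    """Return the Cy3/Cy5 ratio of background-corrected signals for one element.

    Returns None when either corrected signal is not above background,
    because a ratio is not meaningful in that case (an assumption of this sketch).
    """
    cy3 = cy3_signal - cy3_background
    cy5 = cy5_signal - cy5_background
    if cy3 <= 0 or cy5 <= 0:
        return None
    return cy3 / cy5


# Hypothetical element: 2000 corrected units in Cy3, 1000 in Cy5.
de = differential_expression(2100.0, 100.0, 1100.0, 100.0)
print(de)  # 2.0
```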

The Axon scanner was calibrated using a primary standard 
and a secondary standard to account for the differences in 
scanner performance [laser and photomultiplier tube (PMT)] 
between the Cy3 and Cy5 channels. For the primary standard 
hundreds of probe samples were prepared that were 
fluorescently balanced in the Cy3 and Cy5 channels as determined by 
a Fluorolog3 fluorescence spectrophotometer (Instruments 
SA, Edison, NJ). These probes were hybridized to microarrays 
and the scanner PMTs were adjusted to give balanced 
fluorescence and the greatest dynamic range. Using these PMT values 
a fluorescent plastic slide was scanned to obtain corresponding 
fluorescent values. This secondary standard was used to 
calibrate scanners on a daily basis. 

Data acquisition and analysis 

Two low frequency data correction algorithms were applied to 
compensate for systematic variations in data quality. The first 
procedure, a gradient correction algorithm, modeled the signal 
response surfaces of each channel. On a 10 000 element micro- 
array the signal responses of Cy3 and Cy5 should be random 
due to the random physical location of the target elements. The 
signal response surfaces were first examined for non-random 
patterns. If non-random patterns were detected, a second order 
response model was applied to model the gene signal 
responses according to their positions on the surface. The non- 
randomness was then corrected using the fitted model. The 
second procedure, a signal correction algorithm, corrected for 
differential rates of incorporation of the Cy3 and Cy5 dyes. In 
an idealized homotypic hybridization, a scatter plot of log Cy3 
signal versus log Cy5 signal should show a signal distribution 
along a line with a slope of 1. If the center line of the signals 
does not have a slope of 1 there may be different rates of 
incorporation of the Cy3 and Cy5 dyes. The signal correction 
algorithm tested whether the slope of the regression line for log 
Cy3 signal versus log Cy5 signal was 1 and applied a regression 
model to rotate the regression line to a slope of 1 if necessary. 
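
The signal correction step can be sketched as follows. This is one possible reading rather than the authors' algorithm: it fits log Cy3 against log Cy5 by ordinary least squares and, if the slope differs from 1 by more than a tolerance, rotates the points about the centroid of log Cy5 so that the refitted slope is 1. The pivot point, the tolerance and all names are assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_dye_bias(cy3, cy5, tolerance=0.01):
    """Return (corrected Cy3 signals, fitted slope before correction)."""
    log_cy3 = [math.log10(v) for v in cy3]
    log_cy5 = [math.log10(v) for v in cy5]
    slope, _ = fit_line(log_cy5, log_cy3)
    if abs(slope - 1.0) <= tolerance:
        return list(cy3), slope  # slope already close enough to 1
    mean_x = sum(log_cy5) / len(log_cy5)
    # Adding (1 - slope) * (x - mean_x) changes the fitted slope to exactly 1
    # while leaving the value at the centroid unchanged.
    corrected = [10 ** (y + (1.0 - slope) * (x - mean_x))
                 for x, y in zip(log_cy5, log_cy3)]
    return corrected, slope


# Hypothetical signals in which Cy3 responds with a slope of 1.1 on the log scale.
cy5_signals = [100.0, 1000.0, 10000.0]
cy3_signals = [10 ** 2.2, 10 ** 3.3, 10 ** 4.4]
corrected_cy3, original_slope = correct_dye_bias(cy3_signals, cy5_signals)
new_slope, _ = fit_line([math.log10(v) for v in cy5_signals],
                        [math.log10(v) for v in corrected_cy3])
```

Working on the log scale means a slope different from 1 corresponds to a signal-dependent ratio, so restoring the slope to 1 removes a bias that would otherwise grow with signal intensity.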





[Figure 1 plot. Six panels, one per amount of PCR product in the well (0.0625, 0.125, 
0.25, 0.5, 1.0 and 2 µg/well). x-axis: Log10 Input Differential Expression 
(Cy3/Cy5 Signal), -1.5 to 1.5. y-axis: observed differential expression, -2.0 to 2.0.] 



Figure 1. Impact of input DNA concentration on differential gene expression. 
A dilution series of PCR product for three yeast control fragments was arrayed 
in triplicate in each of four quadrants. The amount of PCR product in the well prior 
to arraying is indicated above each panel. Input RNA ratios for labeling with Cy3 
versus Cy5 for the three fragments were 30:1, 1:1 and 1:30. The log10 of observed 
differential expression is plotted as a function of log10 of input RNA ratios. 



RESULTS AND DISCUSSION 

Impact of arrayed DNA concentration on DEs 

Because of the competitive nature of two channel fluorescent 
hybridizations it has been assumed that the amount of target 
DNA deposited on the glass slide would have little or no 
impact on any observed DEs (15). We tested this assumption 
directly by hybridizing a series of samples at predetermined 
input ratios to microarrays containing varying amounts of 
target DNA. For these experiments the target DNAs were yeast 
fragments, a set of PCR products derived from the non-coding 
regions of Saccharomyces cerevisiae. The amount of PCR 
product was quantified using a fluorescent dye (PicoGreen) 
specific for double-stranded DNA. The targets were spotted in 
three sets containing quadruplicate points from a 2-fold 
dilution series of DNA concentrations ranging from 2.0 to 
0.062 µg/well (10 µl/well). 

Probes for hybridization to the yeast fragments were made 
from T7 RNA transcripts of PCR products. Templates for in 
vitro transcription were made by incorporating a T7 promoter 
in the upstream PCR primer and poly(A) sequences in the 
downstream PCR primer. In vitro transcripts of the yeast fragments 
were purified, quantified and included in every labeling reaction 
at predetermined Cy3:Cy5 input levels (fragment 22, 123:4 pg; 
fragment 6, 123:123 pg; fragment 25, 4:123 pg). All probe 
labeling reactions were done in the presence of 200 ng poly(A) 
mRNA, from either human brain or heart (Biochain, Hayward, 
CA). Hybridization of these probes was performed on three 
different days, across 20 microarrays representing two 
different batches and by multiple operators. A comparison of 
the expected differential expression and the experimentally 
observed differential expression is shown in Figure 1. These 
results indicate that target DNA arrayed at input concentrations 
<1.0 µg/10 µl results in an underestimate or compression of the 
observed differential expression, with more compression 
occurring at lower DNA concentrations. 
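
To make the expected values concrete, the short sketch below converts the stated Cy3:Cy5 input masses into log10 ratios, which is the scale used in Figure 1. The dictionary layout and variable names are illustrative only; the input masses are the ones given in the text.

```python
import math

# Cy3:Cy5 input masses in pg for each yeast control fragment, as stated above.
inputs_pg = {
    "fragment 22": (123.0, 4.0),
    "fragment 6": (123.0, 123.0),
    "fragment 25": (4.0, 123.0),
}

# Expected log10 differential expression if the array reported the input ratio exactly.
expected_log10_de = {
    name: math.log10(cy3 / cy5) for name, (cy3, cy5) in inputs_pg.items()
}

for name, value in expected_log10_de.items():
    print(name, round(value, 2))
# fragment 22 1.49
# fragment 6 0.0
# fragment 25 -1.49
```

An observed log10 value closer to zero than these expectations is what the text calls compression of differential expression.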

Quantification of DNA amplimers on the array by a 
hybridization-independent method 

The DNA concentration of the input printing solutions may not 
be directly predictive of the amount of DNA actually retained 
on the glass. Variations in the transfer efficiency of individual 
DNA sequences to the glass and variations in its subsequent 
retention through the post-arraying and processing procedures 
may have an impact on the amount of DNA retained. Therefore, a 
second DNA staining assay was developed using SYTO 61 
fluorescent dye, which directly measured the amount of DNA 
actually retained on the glass, independent of hybridization. 

Qualification of 10 000 element cDNA microarrays 

Based on the preliminary experiments we applied the 
PicoGreen and SYTO 61 assays to evaluate two independent 
10 000 element microarrays (Fig. 2). Each of the 104 96-well 
plates used to print the arrays was qualified by PicoGreen 
analysis and all plate sets had high levels of PCR amplimer 
(>1.0 µg/well) (Fig. 2A). The plate sets used to prepare the 
HGG1 arrays, however, had a greater overall average DNA 
concentration than those used to prepare the UGV1 arrays: 
median 3.6 versus 1.85 µg/well, respectively. 

An array from each batch was hybridized with a complex 
cDNA probe derived from placenta RNA in both the Cy3 and 
Cy5 channels. SYTO 61 staining was performed on an additional 
array from each batch and a comparison of the signal outputs 
for SYTO 61 and hybridization probes for both array batches is 
shown in Figure 2B and C. Observed hybridization signals 
were generally higher for the HGG1 array (Fig. 2C) as 






e41 Nucleic Acids Research, 2001, Vol. 29, No. 8 



[Figure 2 plots. (A) Distribution of DNA per well (PicoGreen) for UGV1 and HGG1. 
(B, C) Cy3 and Cy5 hybridization signal versus SYTO signal (RFUs), log scales. 
(D) Distribution of hybridization signal (log RFUs).] 



Figure 2. Quality control analysis of microarray batches. A set of eight wells randomly selected from each of 104 96-well plates from microarray types UGV1 and 
HGG1 was analyzed with PicoGreen. The distribution of DNA concentrations is shown in (A). The amount of hybridization signal with a complex probe (Cy3 
Brain/Cy5 Heart) is shown as a function of the amount of DNA retained on the glass for microarray types UGV1 (B) and HGG1 (C). Signal distributions from 
hybridizations with a vector-specific oligonucleotide probe for each array type are shown in (D). 



compared to the UGV1 array (Fig. 2B): median Cy3 1049 
versus 310 relative fluorescence units (r.f.u.), median Cy5 
1137 versus 302 r.f.u., respectively. This was consistent with 
the higher amount of DNA on the glass for the HGG1 array: 
median 2532 versus 1905 r.f.u. Higher hybridization signals 
(>10 000 r.f.u.) were routinely observed when the amount of 
target DNA bound to the glass approached 2000 r.f.u. by 
SYTO 61 staining (data not shown). In the examples shown, 
35% of the elements on the UGV1 microarray have SYTO 61 
stain values <2000 r.f.u., as compared to only 9% of the 
elements on the HGG1 array. There was an apparent discrepancy 
in the UGV1 microarray: 65% of all elements on the UGV1 
array had higher levels of bound DNA, but few yielded 
hybridization signals >10 000 r.f.u. 

To address this issue a third assay was developed. An array 
from each batch was hybridized with a Cy3-labeled 
oligonucleotide probe specific for the common vector sequence 
found in all the PCR products. The signal distribution for these 
vector hybridizations is presented in Figure 2D. The majority 
of elements on the UGV1 microarray had significantly lower 
hybridization signals than the HGG1 array: median 1901 
versus 6507 r.f.u. These results correlated better with complex 
probe hybridization than SYTO 61 staining (Fig. 2B and C). 

The manufacture of high quality, reproducible arrays with 
10 000 or more unique PCR products is an expensive and 
time-consuming effort. It requires considerable attention to the 
details of each step in the process and defined procedures to 
ensure quality and reproducibility. The data presented in this 
report show that low concentrations of DNA in the input 
printing solutions result in reduced amounts of arrayed DNA 
and this, in turn, reduces the dynamic signal range and 
produces an apparent compression or underestimation of 
differential expression. The assay procedures reported here 
have been implemented in the large-scale production of 
microarrays for use in generating expression databases. 

mRNA input 

The impact of varying the amount of input mRNA on net 
cDNA probe synthesis and hybridization was evaluated. 
Placental mRNAs of varying amounts (25-400 ng) were 
labeled with Cy3 and hybridized to an equal aliquot labeled with 
Cy5. Increasing the placental mRNA input yielded increasing 
amounts of total cDNA product (Fig. 3A). Hybridization 
signal-to-background and dynamic range also increased as the mRNA 
input increased, although a clear point of 'diminishing returns' 
occurs above 200 ng mRNA input (Fig. 3B and C). Based on 
this mRNA titration series, we believe that 200 ng mRNA is 
the optimal standard input for labeling reactions. A 
representative example of a competitive 
hybridization with balanced RNA inputs (200:200 ng) is 
presented in Figure 4A. 

Figure 3. mRNA titration and balance. (A-C) Probe fluorescent signals, signal-to-background and dynamic range as a function of input mRNA mass. Duplicate 
labeling reactions containing equal amounts of placenta mRNA in both the Cy3 and Cy5 channels were labeled and hybridized to UniGEM V2 arrays. Each data 
point is an average from the two hybridizations. Probe fluorescence signal was converted to moles product using a standard curve. Range minimum values 
remained between 100 and 200 U for all hybridizations. (D and E) An aliquot of 50 or 400 ng placenta mRNA was labeled with Cy3 and hybridized to either 400 
or 50 ng mRNA labeled with Cy5, respectively, in duplicate. Only one of the two hybridizations is shown. The axes are in arbitrary fluorescent units. 

We tested the effect of unbalanced competitive hybridization 
by hybridizing product prepared from different input levels of 
placental mRNA in the labeling process (Fig. 3D and E). We 
observed significant loss in precision and a distortion of the 
population from the theoretical DE of 1, especially in the lower 
signal range. This distortion reflects the impact of both 
differential labeling and hybridization of transcripts with different 
amounts of mRNA input. Reversing the ratio of input mRNA 
for probe synthesis resulted in the opposite curvature (Fig. 3E). 
We conclude that accurate quantification and use of an 
equivalent mRNA mass for labeling in both channels is essential 
for optimum results. 

Homotypic response 

An estimate of the accuracy and precision of array-generated 
expression data was first made by performing a series of 
replicate experiments using various homotypic hybridizations. 
A competitive hybridization of fluorescently labeled Cy3 and 
Cy5 cDNA, both prepared from the same placental mRNA, 
should theoretically give a DE (or Cy3 fluorescence divided by 
Cy5 fluorescence) of 1 for all 10 000 elements arrayed on the 
slide. With replicate hybridizations we can evaluate the overall 
precision of the data using various statistical parameters and 
obtain an estimate of accuracy from any deviation(s) observed 
from the theoretical value. 

Figure 4. (A) Scatter plot of the calibrated Cy3 versus Cy5 fluorescence 
response from a typical placenta:placenta hybridization. The diagonal line through 
the origin corresponds to the expected DE of 1. The other diagonal lines define DE 
values as indicated next to the line. (B) Histogram showing the distribution of 
elements by ln of their experimentally derived DEs for 10 homotypic placental 
hybridizations. 

A scatter plot of the Cy3- versus Cy5-calibrated fluorescent 
response from a single placenta:placenta hybridization is 
shown in Figure 4A. Virtually all gene elements lie close to the 
diagonal line corresponding to the expected DE of 1. Overall 
system response was observed to be linear over about three 
orders of magnitude. 

Approximately 100 000 data points from 10 homotypic 
placenta hybridizations were used to construct a histogram showing 
the frequency or distribution of gene elements (as a percentage of 
the total) around the natural log of the expected DE (ln 1.0 = 0). Effectively, 
the histogram (Fig. 4B) is a graphical measure of the range of 
the signal response for each selected element. The coefficient 
of variation (CV), or relative standard deviation, provides a 
quantitative estimate of the precision of differential expression. 
The calculated CV for differential expression for all elements 
was 12% across the entire signal range. The same 12% CV 
was observed across two independently manufactured batches 
of cDNA microarrays (data not shown). 
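
As a minimal sketch of the CV calculation, the standard deviation of a set of DE values is divided by their mean. The replicate values below are hypothetical and chosen only to make the arithmetic easy to follow; the text does not state whether the authors computed the CV on DE or on log DE.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (relative standard deviation)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical DE values for one element across three homotypic hybridizations.
de_values = [0.9, 1.0, 1.1]
cv = coefficient_of_variation(de_values)
print(f"{cv:.0%}")  # 10%
```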

Ten similar homotypic hybridization experiments were 
conducted with both human brain and heart samples and the 
data were compared to the placenta results described above. 
Results for both sets of hybridizations were identical (data not 
shown). The same 12% CV for differential expression was 
observed independent of tissue type over the entire signal 
range. 

Accuracy, in terms of bias, was estimated by calculating an 
average experimental DE directly from observed fluorescence 
output and comparing it to the expected value of 1.00. For each 
of the three tissue types above (placenta, brain and heart) the 
average (n = 10) experimental DE values were 0.999983, 
0.99977 and 0.9998, respectively. The overall average was 
0.9999, or less than 1 part in 10 000. These values are in good 
agreement not only within the group, but also with the 
expected theoretical value of 1.00. 

The observed variation in individual element responses 
(from the expected log DE of 0) for 180 randomly selected genes 
across the full range of observed signal response (as a function 
of Cy5 signal) is shown in Figure 5A-C for placenta, brain and 
heart tissue. For each of the 180 elements selected all 10 
replicate data points are plotted for each gene from each tissue 
type. Regardless of tissue type we observed few data points 
with a differential expression greater than 2, even at low 
overall signal levels. 

From the above data we can calculate the change in DE 
required before the value has statistical significance. Mathematically
this can be written in terms of the two-sided statistical
tolerance interval for the differential expression of non-
differentiated elements (16). A statistical tolerance interval is
one that contains a specified portion, p, of the entire sampled
population with a specified degree of confidence, 100(1 − α)%.
Table 1 shows the 99.5% tolerance intervals for 99% of the 
elements from each tissue type: all observed differential 
expression values fall between ±1.4. 
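
A two-sided tolerance interval of this kind can be sketched as follows. The source does not state which formula was used; this sketch assumes normally distributed values, Howe's approximation for the tolerance factor, and the Wilson-Hilferty approximation for the chi-square quantile (chosen only to keep the example within the standard library). All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile_wh(q, dof):
    """Approximate chi-square quantile (Wilson-Hilferty); rough for small dof."""
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * sqrt(c)) ** 3


def tolerance_factor(n, coverage=0.99, confidence=0.995):
    """Howe's approximate two-sided normal tolerance factor k."""
    if n < 3:
        raise ValueError("this approximation needs at least 3 observations")
    dof = n - 1
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    chi2_low = chi2_quantile_wh(1.0 - confidence, dof)
    return z * sqrt(dof * (1.0 + 1.0 / n) / chi2_low)


def tolerance_interval(mean, sd, n, coverage=0.99, confidence=0.995):
    """Interval expected to contain `coverage` of the population."""
    k = tolerance_factor(n, coverage, confidence)
    return mean - k * sd, mean + k * sd


low, high = tolerance_interval(mean=0.0, sd=1.0, n=10000)
print(f"({low:.3f}, {high:.3f})")
```

With many observations the factor k approaches the plain normal quantile; with few observations it widens to account for uncertainty in the estimated mean and standard deviation.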

Analysis of variance (ANOVA) was used to estimate the 
contribution of specific potential sources of variance to the 
overall variance measured. Analyses were performed using the 
method of restricted maximum likelihood under SAS for 
Windows v.6.12 procedure PROC MIXED (17). All of the
homotypic placenta, brain and heart data sets were used for this 
analysis. 

There are four general sources of variation in the DE ratios: 
microarray batch, array-to-array hybridization variance 
(including sample preparation), biological source tissue and 
gene sequence variance. Table 2 lists the estimated contribution of 
these potential sources of variation to the overall variance 
measured. The two sources contributing most significantly to 
the overall variation were hybridization variance and sequence 
variance. Hybridization variance represents a source of variation 
from hybridization to hybridization. Sequence variance indicates 
that different elements demonstrate different levels of variation. 
Microarray batches and source tissues were not significant 
sources of variance. 
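
As a worked check on how such components combine, the sketch below assumes the sources are independent, so that their variances (and hence squared CVs) add. That independence is an assumption of this example, not a statement from the analysis; the input values are the contributions listed in Table 2.

```python
from math import sqrt


def combine_cv(components):
    """Combine independent CV contributions (in percent) in quadrature."""
    return sqrt(sum(c * c for c in components))


# Batch, source tissue, hybridization, gene sequence (percent CV, Table 2).
total_cv = combine_cv([0.0, 0.0, 7.8, 9.4])
print(f"combined CV = {total_cv:.1f}%")
```

Under this assumption the hybridization and sequence contributions combine to roughly 12.2%, close to the 12.0% total reported; the small difference may reflect rounding or the estimation method.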

Figure 5. Variation in individual element responses for 180 randomly selected
genes over the full range of observed signal response (expressed as log Cy5
signal). All 10 replicate data points for each selected element are plotted along
the vertical axis. Horizontal lines define the tolerance interval outside of which
DE was deemed significant (see text). (A) Homotypic placental hybridizations.
(B) Homotypic brain hybridizations. (C) Homotypic heart hybridizations.



Differential expression

Using placental mRNA as a common reference, four sets of 
experimental conditions to measure differential expression 
were evaluated. Each set contained 10 replicate hybridizations: 
brain:placenta, placenta:brain, heart:placenta and placenta:heart.
Estimates of system precision and detection limits were made 
as described above for the homotypic hybridizations. 



Source               Tolerance interval
Placenta:placenta    (-1.332, 1.332)
Brain:brain          (-1.397, 1.397)
Heart:heart          (-1.384, 1.384)
All combined         (-1.370, 1.370)

The 99.5% tolerance intervals contain at least 99% of the elements on each
microarray.




Table 2. Variance component estimation for homotypic hybridizations

Variation source     Estimated CV contribution
Microarray batch     0.0%
Source tissue        0.0%
Hybridization        7.8%
Gene sequence        9.4%
Total                12.0%

ANOVA was performed on placenta, brain and heart homotypic hybridizations.




Figure 6. Scatter plot of Cy3-labeled cDNA from heart (x-axis) hybridized to
the array with Cy5-labeled cDNA from placenta (y-axis) (single experiment).
Compare with Figure 4A.



Figure 6 shows the fluorescence response plot of a single 
representative experiment conducted with Cy3-labeled cDNA 
from heart competitively hybridized to the array with Cy5-labeled 
cDNA prepared from placenta. Most of the elements (>90%) 
fell on or close to the 45° line representing no differential
expression (or DE = 1.00). However, in contrast to the homotypic
hybridizations (Fig. 4A), 10% of the elements were also
observed to fall outside the tolerance interval, which may 
indicate significant differential expression (Table 1). 

From 10 such replicate experiments in this set we calculated 
a CV for each of the 10 000 elements and plotted the values 
against the overall dynamic signal range (as a function of log 
Cy5 fluorescence signal) as shown in Figure 7A. The average
CV was observed to be 10-12% across the entire signal range, 
although there was slightly greater variation at low signal 

Figure 7. (A) CV for each of 10 000 elements derived from 10 replicate
heart:placenta hybridizations plotted as a function of the average observed
signal (as Cy5 signal). (B) CV for the same 10 000 elements plotted as a function
of the natural logarithm of the average observed DE (ln DE).

Figure 8. Reciprocal labeling experiments showing the data plotted from
180 random elements from (A) 10 replicate brain:placenta (black symbols) and
10 replicate placenta:brain (blue symbols) hybridizations versus log DE, and
(B) 10 replicate heart:placenta (black) and 10 replicate placenta:heart (blue)
hybridizations.



levels. Figure 7B shows the CV for the same 10 000 elements 
above plotted as a function of average DE. Most elements are 
observed to cluster near the value 0, indicating no differential 
expression. However, the CV of 12% observed for non- 
differentiated elements, on average, was slightly smaller than 
the CV for differentiated elements in either direction. The 
observed average CV ranged from 12% for non-differentiated
elements to a maximum value of 25% for elements differentially 
expressed by a factor of 100. Since the DE is a ratio of the 
signals from the two channels, variations in the denominator at 
lower signal levels have a larger impact. Despite these minor 
differences, overall system precision remains excellent. 

The same 180 random elements in Figure 5 were evaluated
in 'reciprocal dye labeling' experiments. Theoretically, the
Cy3- and Cy5-labeled primers should function equivalently for 
cDNA synthesis. However, any differences in incorporation of 
label would, if significant, identify differential expression 
where none exists. It could also account for some of the 
variation we observe in the different parameters evaluated in 
this study. Therefore, we performed a series of additional 
experiments specifically designed to address this issue. 

The data from 10 replicates of the brain:placenta hybridizations
were compared to the data from 10 replicates of the reciprocally
labeled placenta:brain hybridizations. Figure 8A shows a plot
of the DE for 180 random elements from both sets of data. The
DE for any given element in the first set of hybridizations
should simply be the reciprocal of the DE for the same element
in the second set (when the labeling is reversed). As Figure 8A
shows, the cluster of 10 data points for each element from set 1
lies the same distance above the horizontal line through log10
1.0 = 0 as the corresponding cluster from set 2 lies below it.
Figure 8B shows a similar plot generated from 20 microarrays,
where 10 heart:placenta hybridizations were compared to the
reciprocally labeled placenta:heart hybridizations, with essentially
equivalent results.

For each element we can define the axial symmetry of reflection 
(ASR) as the inflection point between the DEs from the 
reciprocal labeling experiments, calculated by averaging the 
two DE ratios. Calculated average ASR values of 0.998 and
0.999 were obtained from the placenta:brain and placenta:heart
data sets, respectively, in good agreement with the theoretical
value of 1.00. Thus any systematic bias introduced into the DE
by reciprocal labeling must be less than 1-2 parts in 1000.
These results independently verify the precision in measuring
differential expression, as well as in identifying those genes
that are not differentially expressed. Histograms showing the
distribution of all elements (as a percentage of the total) as a
function of ln ASR (Fig. 9A and B) were similar to the histogram

Figure 9. Histograms showing the distribution of all elements as a function of
ln ASR from reciprocal labeling experiments. (A) Data for brain:placenta and
placenta:brain hybridizations. (B) Data for heart:placenta and placenta:heart
hybridizations.

observed for non-differentiated elements (Fig. 4B). They also 
had the same standard deviation. Therefore, any variation 
observed in DE was likely a result of real variations in experi- 
mental mRNA levels, rather than an artifact of the labeling 
system. 
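
The reciprocal-labeling check can be sketched in a few lines. The text says the ASR is obtained by averaging the two DE ratios but does not state the scale; this sketch assumes the average is taken on the log scale (a geometric mean), under which a perfectly reciprocal pair gives exactly 1. The function name and example values are hypothetical.

```python
from math import exp, log


def axial_symmetry_of_reflection(de_forward, de_reverse):
    """Geometric mean of a DE ratio and its reciprocally labeled counterpart."""
    if de_forward <= 0 or de_reverse <= 0:
        raise ValueError("DE ratios must be positive")
    return exp((log(de_forward) + log(de_reverse)) / 2.0)


# An unbiased element: twofold up in one labeling, twofold down in the other.
print(axial_symmetry_of_reflection(2.0, 0.5))
# A hypothetical dye bias inflating the forward ratio shifts the ASR above 1.
print(axial_symmetry_of_reflection(2.2, 0.5))
```

On this reading, ln ASR is zero for an unbiased element, which matches the way the histograms in Figure 9 are plotted against ln ASR.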

A series of independent yeast standards was also included on
each microarray to assist in evaluating overall system performance.
These controls demonstrated linearity in overall signal
response over three orders of magnitude, a CV of 12% and a
limit of detection of 2 pg mRNA at a signal-to-background
ratio of 2.5 (data not shown).

CONCLUSION 

In this report we have described measures important in the 
manufacture of cDNA microarrays and in the preparation and
labeling of mRNAs for use in a two-channel hybridization 
system. Furthermore, the results presented in this report 
demonstrate in a quantitative fashion the performance of the 
cDNA microarray technology platform. The usefulness of any 
expression database is ultimately dependent on the quality of 
the underlying data used to construct it. We report that the 
cDNA microarray platform does provide the high quality data 
needed to establish reliable gene expression databases. 



The analytical methods used to evaluate the performance of 
the cDNA microarray platform described in this report provide 
a practical framework for evaluating the performance of other 
technologies that purport to measure global mRNA expression. 
Only by disclosing the performance characteristics in a 
rigorous manner can researchers gauge the utility of any data 
produced by other platforms. 

Gene Expression Profile of
Aging and Its Retardation by
Caloric Restriction

Cheol-Koo Lee, Roger C. Klopp,
Richard Weindruch, Tomas A. Prolla

The gene expression profile of the aging process was analyzed in skeletal muscle
of mice. Use of high-density oligonucleotide arrays representing 6347 genes
revealed that aging resulted in a differential gene expression pattern indicative
of a marked stress response and lower expression of metabolic and biosynthetic
genes. Most alterations were either completely or partially prevented by caloric
restriction, the only intervention known to retard aging in mammals. Transcriptional
patterns of calorie-restricted animals suggest that caloric restriction
retards the aging process by causing a metabolic shift toward increased protein
turnover and decreased macromolecular damage.



Most multicellular organisms exhibit a progressive
and irreversible physiological decline
that characterizes senescence, the molecular
basis of which remains unknown.
Postulated mechanisms include cumulative
damage to DNA leading to genomic instability,
epigenetic alterations that lead to altered
gene expression patterns, telomere shortening
in replicative cells, oxidative damage to critical
macromolecules by reactive oxygen species
(ROS), and nonenzymatic glycation of
long-lived proteins (1, 2).

Genetic manipulation of the aging process
in multicellular organisms has been achieved
in Drosophila through the overexpression of
catalase and Cu/Zn superoxide dismutase (3),
in the nematode Caenorhabditis elegans
through alterations in the insulin receptor signaling
pathway (4), and through the selection
of stress-resistant mutants in either organism
(5). In mammals, mutations in the Werner
Syndrome locus (WRN) accelerate the onset
of a subset of aging-related pathology in humans
(6), but the only intervention that appears
to slow the intrinsic rate of aging is
caloric restriction (CR) (7). Most studies
have involved laboratory rodents which,
when subjected to a long-term, 25 to 50%
reduction in calorie intake without essential
nutrient deficiency, display delayed onset of
age-associated pathological and physiological
changes and extension of maximum lifespan.
Postulated mechanisms of action include
increased DNA repair capacity, altered
gene expression, depressed metabolic rate,
and reduced oxidative stress (7).

To examine the molecular events associ- 
ated with aging in mammals, we used oligo- 
nucleotide-based arrays to define the tran- 
scriptional response to the aging process in 
mouse gastrocnemius muscle. Our choice of 
tissue was guided by the fact that skeletal 
muscle is primarily composed of long-lived, 
high oxygen-consuming postmitotic cells, a 



feature shared with other critical aging targets 
such as heart and brain. Loss of muscle mass 
(sarcopenia) and associated motor dysfunc- 
tion is a leading cause of frailty and disability 
in the elderly (8). At the histological level,
aging of gastrocnemius muscle in mice is 
characterized by muscle cell atrophy, varia- 
tions in size of muscle fibers, presence of 
lipofuscin deposits, collagen deposition, and 
mitochondrial abnormalities (9). 

A comparison of gastrocnemius muscle 
from 5-month (adult) and 30-month (old) 
mice (10-12) revealed that aging is associat- 
ed with alterations in mRNA levels, which 
may reflect changes in gene expression, 
mRNA stability, or both. Of the 6347 genes 
surveyed in the oligonucleotide microarray,
only 58 (0.9%) displayed a greater than twofold
increase in expression levels as a function
of age, whereas 55 (0.9%) displayed a
greater than twofold decrease in expression.
These findings are in agreement with a differential
display analysis of gene expression
in tissues of aging mice (13). Thus, the aging
process is unlikely to be due to large, wide- 
spread alterations in gene expression. 
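
The twofold screen and the reported percentages can be illustrated with a short sketch. The counts 58, 55 and 6347 come from the text; the expression values, gene labels and function name are hypothetical, and the sketch does not reproduce the pairwise-comparison algorithm used in the study.

```python
def count_twofold_changes(adult, old):
    """Count genes whose old/adult signal ratio is above 2 or below 0.5."""
    up = down = 0
    for gene, adult_signal in adult.items():
        ratio = old[gene] / adult_signal
        if ratio > 2.0:
            up += 1
        elif ratio < 0.5:
            down += 1
    return up, down


# Hypothetical average signals for four genes in adult and old muscle.
adult_signal = {"geneA": 100.0, "geneB": 250.0, "geneC": 80.0, "geneD": 400.0}
old_signal = {"geneA": 380.0, "geneB": 240.0, "geneC": 30.0, "geneD": 410.0}
up_count, down_count = count_twofold_changes(adult_signal, old_signal)

# Fractions reported in the text: 58 and 55 genes out of 6347 surveyed.
pct_up = round(100 * 58 / 6347, 1)
pct_down = round(100 * 55 / 6347, 1)
print(up_count, down_count, pct_up, pct_down)
```

Both reported counts round to 0.9% of the genes surveyed, which is the basis for the statement that aging does not involve widespread changes in expression.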

Functional classes were assigned to genes 
displaying the largest alterations in expres- 
sion (Table 1). Of the 58 genes that increased 
in expression with age, 16% were mediators 
of stress responses, including the heat shock 
factors Hsp71 and Hsp27, protease Do, and 
the DNA damage-inducible gene GADD45 
(14). The largest differential expression between
adult and aged animals (a 3.8-fold
induction) was observed for the gene encoding
the mitochondrial sarcomeric creatine kinase,
a critical target for ROS-induced inactivation
(15).

A consequence of skeletal muscle aging is 
loss of motor neurons followed by reinnerva- 
tion of muscle fibers by the remaining intact 
neuronal units (16). Genes involved in neuronal
growth accounted for 9% of genes highly induced
in 30-month-old animals, including neurotrophin-3
(17), a growth factor induced during
reinnervation, and synaptic vesicle protein-2,
implicated in neurite extension (18). PEA3, a
transcriptional factor induced in the response to
muscle injury and previously shown to be highly
expressed in muscle from old rats (19), was
also induced in aged muscle. We also observed
parallels between our results and data from
fibroblasts undergoing in vitro replicative senescence.
For example, HIC-5, a transcriptional
factor induced by oxidative damage, and insulin-like
growth factor binding protein, both associated
with in vitro senescence (20), are induced
in aged skeletal muscle.

Fifty-five (0.9%) genes displayed a greater
than twofold age-related decrease in expression.
Genes involved in energy metabolism accounted
for 13% of these alterations (Table 1).
These include alterations in genes associated
with mitochondrial function and turnover, such
as the adenosine 5'-triphosphate (ATP) synthase
A chain and nicotinamide adenine dinucleotide
phosphate (NADP) transhydrogenase
genes (both involved in mitochondrial bioenergetics),
the LON protease implicated in mitochondrial
biogenesis, and the ERV1 gene involved
in mitochondrial DNA (mtDNA) maintenance
(21). Additionally, a decrease in metabolic
activity is suggested through a decline in
the expression of genes involved in glycolysis,
glycogen metabolism, and the glycerophosphate
shunt (Table 1).

Aging was also characterized by large reductions
(twofold or more) in the expression of
biosynthetic enzymes such as squalene synthase
(fatty acid and cholesterol synthesis),
stearoyl-coenzyme A (CoA) desaturase (polyunsaturated
fatty acid synthesis), and EF-1-gamma
(protein synthesis). This suppression
was accompanied by a concerted decrease in
the expression of genes involved in protein
turnover, such as the 20S proteasome subunit,
the 26S proteasome component TBP1, ubiquitin-thiolesterase,
and the Unp ubiquitin-specific
protease, all of which are involved in the
ubiquitin-proteasome pathway of protein turnover
(22). The directions of changes in other
functional categories, such as signal transduction,
and transcriptional and growth factors, did
not present a consistent age-related trend.

In order to study the effects of CR on the 
gene expression profile of aging, we reduced 
caloric intake of C57BL/6 mice to 76% of 
that fed to control animals in early adulthood 
(2 months of age), and this dietary regimen 
was maintained until animals were killed at 
30 months. A comparison of 30-month-old
control and CR mice revealed that aging- 
related changes in gene expression profiles 
were remarkably attenuated by CR. Of the 
largest age-associated alterations (twofold or 
higher), 29% were completely prevented by 
CR and 34% were partially suppressed (Table 
1). Of the four major gene classes that dis- 
played consistent age-associated alterations, 
84% were either completely or partially suppressed
by CR. Thus, at the molecular level,



Table 1 (left). Aging-related changes in gene expression in gastrocnemius
muscle. The extent to which caloric restriction prevented age-associated alterations
in gene expression is denoted as either C (complete, >90%), N (none), or
partial (20 to 90%, percentage effect indicated). The fold increase shown represents
the average of all nine possible pairwise comparisons among individual
mice determined by means of a specific algorithm (12). GenBank accession
numbers are listed under ORF. A more comprehensive list that includes genes
that did not fit into the six classes can be found at
www1.genetics.wisc.edu/prolla/Prolla.Tables.html

ORF        Δ Age   Gene                                               Function                          CR
           (fold)                                                                                       prevention
W0e057     ↑3.5    Heat Shock 27 kDa Protein                          Chaperone                         C
M17790     ↑3.5    Serum Amyloid A Isoform 4                          Unknown                           N
AA114576   ↑3.4    Heat Shock 71 kDa Protein                          Chaperone                         C
L28177     ↑2.6    GADD45                                             DNA damage response               77%
M74570     ↑2.4    Aldehyde Dehydrogenase II                          Aldehyde detoxification           20%
AA0&9B62   ↑2.2    Protease Do Precursor                              Protease                          C
L22462     ↑2.2    HIC-5                                              Senescence and differentiation    C
X99863     ↑2.2    rhoB                                               Unknown                           87%
Xe5627     ↑2.1    TNZZ                                               RNA metabolism                    64%
X67277     ↑1.6    Rac1                                               JNK activator                     C
AA071777   ↑3.6    Synaptic Vesicle Protein 2                         Neurite extension                 61%
XS3267     ↑2.6    Neurotrophin-3                                     Reinnervation of muscle           60%
X78197     ↑2.2    AP-2 Beta                                          Neurogenesis                      N
X89749     ↑2.1    mTGIF                                              Differentiation                   C
AA014024   ↑2.1    Dynactin                                           Transport                         55%
Xa3190     ↑2.1    PEA3                                               Response to muscle injury         C
AA10ei12   ↑3.8    Mitochondrial Sarcomeric Creatine Kinase           ATP generation                    C
AA06ie66   ↑2.0    Dihydropyridine-Sensitive L-type Calcium Channel   Calcium channel                   67%
AA061310   ↓4.1    Mitochondrial LON Protease                         Mitochondrial biogenesis          C
W550a7     ↓2.9    Alpha Enolase                                      Glycolysis                        68%
V00719     ↓2.6    Alpha-Amylase-1                                    Carbohydrate metabolism           N
M81476     ↓2.6    Phosphoprotein Phosphatase                         Glycogen metabolism               C
AA034642   ↓2.4    ERV1                                               mtDNA maintenance                 46%
AA106406   ↓2.0    ATP Synthase A Chain                               ATP synthesis                     N
AA041828   ↓2.0    IPP-2                                              Glycogen metabolism               C
L27B42     ↓2.0    PMP35                                              Peroxisome assembly               60%
Z49204     ↓2.0    NADP Transhydrogenase                              Glycerophosphate shunt            N
AA071776   ↓1.9    Glucose-6-Phosphate Isomerase                      Glycolysis                        C
M13366     ↓1.9    Glycerophosphate Dehydrogenase                     Glycerophosphate shunt            C
AA107752   ↓2.9    EF-1-Gamma                                         Protein synthesis                 63%
U22031     ↓2.6    20S Proteasome Subunit                             Protein turnover                  44%
AA061604   ↓2.2    Ubiquitin Thiolesterase                            Protein turnover                  C
AA145629   ↓2.1    26S Proteasome Component TBP1                      Protein turnover                  C
L006B1     ↓2.1    Unp Ubiquitin Specific Protease                    Protein turnover                  N
U36741     ↓2.0    Rhodanese                                          Mitochondrial protein folding     C
063565     ↓1.7    Proteasome Z Subunit                               Protein turnover                  C
D76440     ↓2.9    Necdin                                             Neuronal growth suppressor        47%
X75014     ↓2.7    Phox2 Homeodomain Protein                          Trophic factor                    65%
M32240     ↓2.1    GAS3                                               Myelin protein                    65%
M16466     ↓3.4    Calpactin I Light Chain                            Calcium effector                  C
L34611     ↓2.3    PTHR                                               Calcium homeostasis               N
AA103356   ↓2.2    Calmodulin                                         Calcium effector                  N
D29016     ↓6.4    Squalene Synthase                                  Cholesterol/fatty acid synthesis  62%
M212B5     ↓2.1    Stearoyl-CoA Desaturase                            PUFA synthesis                    C
U73744     ↓2.1    HSP70                                              Chaperone                         N

Table 2 (right). Caloric restriction-induced alterations
in gene expression. The data represent a comparison between 30-month-old
CR-fed and control-fed mice. The gene expression alterations listed in this
Table are diet related and do not include those representing prevention of
age-associated changes (see Table 1). Additional CR-induced changes are posted
at the aforementioned Web site.



ORF        Δ CR    Gene                                          Function
           (fold)
U05809     ↑4.5    Transketolase                                 Pentose phosphate pathway
W63351     ↑4.1    Fructose-bisphosphate Aldolase                Glycolysis/Gluconeogenesis
AA071776   ↑3.6    Glucose-6-Phosphate Isomerase                 Glycolysis/Gluconeogenesis
U34205     ↑2.3    Glucose Dependent Insulinotropic Polypeptide  Insulin sensitizer
U01B41     ↑2.3    Peroxisome Proliferator Receptor Gamma        Insulin sensitizer
1^6116     ↑2.0    PPAR Delta                                    Peroxisome induction
D42063     ↑1.9    Fructose 1,6-bisphosphatase                   Gluconeogenesis
AA041828   ↑1.9    Protein Phosphatase Inhibitor 2 (IPP-2)       Inhibition of glycogen synthesis
U37091     ↑1.8    Carbonic Anhydrase IV                         CO2 disposal
M13366     ↑1.8    Glycerophosphate Dehydrogenase                Electron transport to mitochondria
AA119866   ↑1.7    Pyruvate Kinase                               Glycolysis
AA14E829   ↑2.3    26S Protease Subunit TBP-1                    Protein turnover
AA107752   ↑2.2    Elongation Factor 1-gamma                     Protein synthesis
W53731     ↑2.1    Signal Recognition Receptor Alpha Subunit     Protein synthesis
U60328     ↑2.1    Proteasome Activator PA28 Alpha Subunit       Protein turnover
           ↑2.0    mCyP-S1 (Cyclophilin)                         Protein folding
           ↑1.9    Translocon-Associated Protein Delta           Protein translocation
           ↑1.6    60S Ribosomal Protein L23                     Protein synthesis
           ↑4.7    Fatty Acid Synthase                           Fatty acid synthesis
           ↑2.5    Glutamine Synthetase                          Glutamine synthesis
           ↑2.4    Cytochrome P450-IIC12                         Steroid biosynthesis
           ↑2.0    Thymidylate Kinase                            dTTP synthesis
           ↑2.0    Purine Nucleoside Phosphorylase               Purine turnover
           ↑2.0    Huntingtin                                    Unknown
D76440     ↑1.9    Necdin                                        Growth suppressor
AAOQ2320   ↓3.4    DnaJ Homolog 2                                Chaperone
X63023     ↓1.O    Cytochrome P450-IIIA                          Detoxification
U032B3     ↓1.8    Cyp1b1 Cytochrome P450                        Detoxification
U14390     ↓1.8    Aldehyde Dehydrogenase                        Detoxification
X7B850     ↓1.8    MAPKAP2                                       Unknown
D26123     ↓1.7    Carbonyl Reductase                            Detoxification
L4406      ↓1.7    Hsp105-beta                                   Chaperone
U40930     ↓1.6    Oxidative Stress-induced Protein              Unknown
U68887     ↓1.6    RAD50                                         Double strand break repair
AA059716   ↓1.7    DNA Polymerase Beta                           Base excision repair
W42234     ↓1.6    XPE                                           Nucleotide excision repair
D43884     ↓1.8    Math-1                                        Differentiation
D16464     ↓1.7    HES-1                                         Differentiation
^13101     ↓1.8    Thyroid Hormone Receptor Alpha-2              Thyroid hormone receptor

ORF entries for the nine rows from mCyP-S1 (Cyclophilin) through Huntingtin
(only eight are legible, so they cannot be assigned to individual rows):
W062B3, W57495, X13135, X16314, AA137659, L32073, X66548, AA022063.

Class key: Energy Metabolism; Neuronal Factors; Protein Metabolism;
Stress Response; Biosynthesis; DNA Repair

Class key: Energy Metabolism; Neuronal Factors; Protein Metabolism;
Stress Response; Biosynthesis; Calcium Metabolism

CR mice appear to be biologically younger
than animals receiving the control diet. 
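
The CR-prevention labels used in Table 1 can be expressed as a small classification rule. The thresholds (complete above 90%, partial from 20 to 90%) are taken from the Table 1 legend; treating anything below 20% as "N" is an assumption of this sketch, and the function name is hypothetical. How the percentage itself was derived is not described in this excerpt and is not modeled here.

```python
def classify_cr_prevention(percent_prevented):
    """Map a percentage of the age-related change prevented by CR to a label."""
    if percent_prevented > 90:
        return "C"  # complete prevention
    if percent_prevented >= 20:
        return f"{percent_prevented:.0f}%"  # partial, percentage reported
    return "N"  # assumed: below 20% is reported as no prevention


for value in (95, 77, 20, 5):
    print(value, classify_cr_prevention(value))
```

Applied to the entries in Table 1, this rule reproduces labels such as 77% for GADD45 and 20% for aldehyde dehydrogenase II.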

Caloric restriction induced a metabolic
reprogramming characterized by a transcriptional
shift toward energy metabolism, increased
biosynthesis, and protein turnover
(Table 2). CR resulted in the induction of 51
genes (1.8-fold or higher) as compared with
age-matched animals consuming the control
diet. Nineteen percent of genes in this class
are related to energy metabolism. Modulation
of energy metabolism was evident through
the induction of glucose-6-phosphate isomerase
(glycolysis), fructose 1,6-bisphosphatase
(gluconeogenesis), IPP-2 (an inhibitor of glycogen
synthesis), and transketolase. Fructose
1,6-bisphosphatase switches the direction of
a key regulatory step in glycolysis toward a
biosynthetic precursor, glucose-6-phosphate.
Remarkably, this same adaptation has been
observed as part of the transcriptional reprogramming
of Saccharomyces cerevisiae accompanying
the diauxic switch from anaerobic
growth to aerobic respiration upon depletion
of glucose (23). Transketolase, which
controls the nonoxidative branch of the pentose
phosphate pathway, provides NADPH
for biosynthesis and reducing power for several
antioxidant systems. CR also induced
transcripts associated with fatty acid metabolism,
such as fatty acid synthase and PPAR-delta,
a mediator of peroxisome proliferation.
Interestingly, CR may act to increase insulin
sensitivity through the induction of glucose-dependent
insulinotropic peptide and PPAR-gamma,
a potent insulin sensitizer (24).

Biosynthetic ability also appears to be
induced in CR mice. CR up-regulated the
expression of glutamine synthase, purine nucleoside
phosphorylase (purine turnover),
and thymidylate kinase (dTTP synthesis). Remarkably,
16% of transcripts highly induced
by CR encode proteins involved in protein
synthesis and turnover, including elongation
factor 1-gamma, proteasome activator PA28,
translocon-associated protein delta, 60S ribosomal
protein L23, and the 26S proteasome
subunit TBP-1.

CR was associated with a 1.6-fold or greater
reduction in expression of 57 genes. Of these,
12% were associated with stress responses or
DNA repair pathways, or both (Table 2).
Among the 6347 genes examined, the most
substantial suppression of gene expression by
CR was for a murine DnaJ homolog (3.4-fold),
a pivotal and inducible heat shock factor that
senses and transduces the presence of misfolded
or damaged proteins in bacteria (25). CR
also lowered the expression of cytochrome
P450 isoforms IIIA and Cyp1b1 (involved in
detoxification), Hsp105 (a heat shock factor),
aldehyde dehydrogenase (an inducible enzyme
involved in detoxification of metabolic by-products),
and an oxidative stress-induced protein
of unknown function. CR reduced the expression
of several DNA repair genes including
XPE (a factor that recognizes multiple DNA
adducts), RAD50 (involved in double-strand
break repair), and DNA polymerase-beta (a
DNA damage-inducible polymerase). We also
find molecular evidence to support a state of
lower basal metabolic rate in CR mice through
lowered expression of the thyroid-hormone receptor
alpha gene (26).

The data presented here provide the first
global assessment of the aging process in
mammals at the molecular level and underscore
the utility of large-scale, parallel gene
expression analysis in the study of complex
biological phenomena. We estimate that the
6347 genes analyzed in this study represent 5
to 10% of the mouse genome. Additional
classes of aging-related genes in skeletal
muscle may be discovered with the development
of higher density mammalian DNA microarrays.
The observed collection of gene expression
alterations in aging skeletal muscle is
complex, reflecting the presence of myocyte,
neuronal, and vascular components. Although
some of the age-associated alterations in gene
expression could represent maturational changes,
this possibility is unlikely given the fact that
the 5-month-old (adult) mice used in this study
were fully mature animals. Importantly, changes
in mRNA levels may not always result in a
parallel alteration in protein levels. However,
the complete or partial prevention of most age-related
alterations by CR suggests that gene
expression profiles can be used to assess the
biological age of mammalian tissues, providing
a tool for evaluating experimental interventions.

Taken as a whole, our results provide evidence
that during aging there is an induction of
a stress response as a result of damaged proteins
and other macromolecules. This response ensues
as the systems required for the turnover of
such molecules decline, perhaps as a result of
an energetic deficit in the cell. In particular, the
observed alterations in transcripts associated
with energy metabolism and mitochondrial
function may reflect either decreased mitochondrial
biogenesis or turnover secondary to cumulative
ROS-inflicted mitochondrial damage (i),
lending support to the concept that mitochondrial
dysfunction plays a central role in aging of
postmitotic tissues. The gene expression profile
also suggests that secondary responses to the
aging process in skeletal muscle involve the
activation of neuronal and myogenic responses
to injury.

A summary of global changes induced by 
aging, and the contrasting effects of CR, are 
shown in Table 3. The transcriptional activation 
of stress response genes that process damaged 
or misfolded proteins during aging, and the 
prevention of this induction by CR, suggest a 
central role for protein modifications in aging. 
Indeed, aging is characterized by an exponen- 
tial increase of oxidatively damaged pix}teins 
(27). Previous analyses of metabolic rates in 
CR animals have led to the suggestion that this 
life-extending regimen acts through a reduction
in metabolic rate, resulting in a lower production
of toxic by-products of metabolism (28).
The CR-mediated reduction of mRNAs encoding
inducible genes involved in metabolic detoxification,
DNA repair, and the response to
oxidative stress supports this view, because it
implies lower substrate availability for these
systems. Additionally, our analysis indicates
that CR may cause a metabolic shift toward
increased biosynthesis and macromolecular
turnover. A hormonal trigger for this shift may
be an alteration in the insulin signaling pathway
through increased expression of genes that mediate
insulin sensitivity, a finding that links our
observations to those obtained through the genetic
analysis of aging in the nematode C.
elegans (4).

Selenium is essential for male fertility in rodents
and has also been implicated in the fertilization
capacity of spermatozoa of livestock
and humans (1). Selenium deficiency is associated
with impaired sperm motility, structural
alterations of the midpiece, and loss of flagellum
(7). However, three decades after the discovery
of selenium as an integral constituent of
redox enzymes (2), the molecular basis of the 
relationship of the essential trace element and 
male fertility remains obscure. The selenoprotein
PHGPx (Enzyme Commission number
1.11.1.12) is abundantly expressed in spermatids
and displays high activity in postpubertal
testis (i). In mature spermatozoa, however, selenium
is largely restricted to the mitochondrial
capsule, a keratin-like matrix that embeds the 
helix of mitochondria in the sperm midpiece
(4). A "sperm mitochondria-associated cys- 
teine-rich protein (SMCP)" (5) had been con- 
sidered to be the selenoprotein accounting for 
the selenium content of the mitochondrial capsule
(4-6). The rat SMCP gene, however, does
not contain an in-frame TGA codon (7) that
would enable a selenocysteine incorporation
(8). In mice, the three in-frame TGA codons of
the SMCP gene are upstream of the translation
start (5). SMCP can therefore no longer be
considered as a selenoprotein. Instead, the "mitochondrial
capsule selenoprotein (MCS)," as
SMCP was originally referred to (4-7), is here
identified as PHGPx. 

Routine preparations of rat sperm mito- 
chondrial capsules (P) yielded a fraction that 
was insoluble in 1% SDS containing 0.2 mM 
dithiothreitol (DTT) and displayed a vesicular
appearance in electron microscopy (Fig.
1A). The vesicles readily disintegrated upon
exposure to 0.1 M mercaptoethanol (Fig. 1B)
and became fully soluble in 6 M guanidine-HCl.
When the solubilized capsule material
was subjected to polyacrylamide gel electrophoresis
(PAGE), four bands in the 20-kD

Abstract

Background: Ras is an area of intensive biochemical and genetic studies and characterizing downstream
components that relay ras-induced signals is clearly important. We used a systematic approach, based on
DNA microarray technology, to establish a first catalog of genes whose expression is altered by ras and,
as such, potentially involved in the regulation of cell growth and transformation.

Results: We used DNA microarrays to analyze gene expression profiles of rasV12/E1A-transformed
mouse embryonic fibroblasts. Among the ~12,000 genes and ESTs analyzed, 815 showed altered
expression in rasV12/E1A-transformed fibroblasts, compared to control fibroblasts, of which 203
corresponded to ESTs. Among known genes, 202 were up-regulated and 410 were down-regulated. About
one half of genes encoding transcription factors, signaling proteins, membrane proteins, channels or
apoptosis-related proteins was up-regulated whereas the other half was down-regulated. Interestingly,
most of the genes encoding structural proteins, secretory proteins, receptors, extracellular matrix
components, and cytosolic proteins were down-regulated whereas genes encoding DNA-associated
proteins (involved in DNA replication and repair) and cell growth-related proteins were up-regulated.
These data may explain, at least in part, the behavior of transformed cells in that down-regulation of
structural proteins, extracellular matrix components, secretory proteins and receptors is consistent with
reversion of the phenotype of transformed cells towards a less differentiated phenotype, and up-regulation
of cell growth-related proteins and DNA-associated proteins is consistent with their accelerated growth.
Yet, we also found very unexpected results. For example, proteases and inhibitors of proteases as well as
all 8 angiogenic factors present on the array were down-regulated in transformed fibroblasts although they
are generally up-regulated in cancers. This observation suggests that, in human cancers, proteases,
protease inhibitors and angiogenic factors could be regulated through a mechanism disconnected from ras
activation.

Conclusions: This study established a first catalog of genes whose expression is altered upon fibroblast 
transformation by rasV12/E1A. This catalog is representative of the genome but not exhaustive, because
only one third of expressed genes was examined. In addition, contribution to ras signaling of post- 
transcriptional and post-translational modifications was not addressed. Yet, the information gathered 
should be quite useful to future investigations on the molecular mechanisms of oncogenic transformation. 

Background

Cancer is a disease caused by multiple genetic alterations
that lead to uncontrolled cell proliferation. This process
often involves activation of cellular proto-oncogenes and
inactivation of tumour-suppressor genes. One of the earliest
and most potent oncogenes identified in human cancer
is the mutant ras [1,2]. The Ras family of proto-oncogenes
encodes small GTP-binding proteins that transduce mitogenic
signals from tyrosine-kinase receptors [reviewed
in [3]]. In vitro, oncogenic ras efficiently transforms most
immortalized rodent cell lines but fails to transform
mouse primary cells [4]. However, ras can transform primary
mouse cells by cooperating with other oncogenic alterations
such as overexpression of c-Myc, dominant
negative p53, D-type cyclins, Cdc25A and Cdc25B, or loss
of p53, p16 or IRF-1 [5-7]. Several viral onco-proteins can
also cooperate with ras, for example SV40 T-antigen, adenovirus
E1A, human papillomavirus E7 and HTLV-1 Tax
[reviewed in [6,7]]. When expressed alone in primary
cells, most of these alterations facilitate their immortalization
[7]. Oncogenic transformation of primary cells by co-expression
of ras and immortalizing mutations constitutes
a model of multistep tumorigenesis that has been reproduced
in animal systems [reviewed in [8,9]].

Ras has been an area of intensive biochemical and genetic
studies [10]. These studies helped to characterize downstream
signaling events and components that relay ras-induced
mitogenic signals to the ultimate transcription
factors which regulate expression of genes involved in cell
growth and transformation. Downstream signaling elicited
by the oncogenic form of Ras protein impairs regulation
of gene expression with eventual disruption of
normal cellular functions. Downstream transcription factors
were found essential for ras-mediated cell transformation
[11-13]. However, compared with our knowledge of
ras signaling events, little is known about target genes involved
in the phenotypic changes resulting from ras activation,
such as cell transformation. Thus, identification of
genes whose expression is altered during ras-mediated cell
transformation would provide important information on
the underlying molecular mechanism. In the present
work, we used DNA microarray technology to analyze
gene expression profiles of rasV12/E1A-transformed primary
mouse embryonic fibroblasts (MEFs), in order to
identify genes whose expression is transformation-dependent.

Results 

Analysis of gene expression changes after rasV12/E1A-transformation

We used microarray analysis to compare expression profiles
of ~12,000 genes in normal vs. rasV12/E1A-transformed
fibroblasts. Figure 1 shows the phenotypic
changes of the rasV12/E1A-transformed MEFs. With Affymetrix
microarray technology, differential expression
values greater than 1.7 are likely to be significant, based
on internal quality control data. We present data which
use a more stringent ratio, restricting our analysis to genes
that are overexpressed or under-expressed at least 2.0 fold
in rasV12/E1A-transformed fibroblasts relative to the empty
retrovirus-transduced MEFs. We summarize the highlights
below and present the full profile in Figure 2.
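
The 2.0-fold cutoff described above can be illustrated with a minimal Python sketch. The gene names and signal values are hypothetical placeholders, not data from this study, and `classify` is an illustrative helper, not part of the Affymetrix software.

```python
# Hypothetical signal values (arbitrary units); NOT data from the study.
# Each entry is (control MEFs, transformed MEFs).
signals = {
    "gene_a": (100.0, 450.0),
    "gene_b": (800.0, 90.0),
    "gene_c": (300.0, 420.0),
}

FOLD_CUTOFF = 2.0  # "at least 2.0 fold", as stated in the text


def classify(control, transformed, cutoff=FOLD_CUTOFF):
    """Return 'up', 'down' or 'unchanged' for one gene."""
    if control <= 0 or transformed <= 0:
        raise ValueError("signals must be positive")
    ratio = transformed / control
    if ratio >= cutoff:
        return "up"
    if ratio <= 1.0 / cutoff:
        return "down"
    return "unchanged"


calls = {gene: classify(c, t) for gene, (c, t) in signals.items()}
```

With the less stringent 1.7 ratio mentioned in the text, the same helper could be called with `cutoff=1.7`.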

Among the ~12,000 genes and ESTs analyzed, expression
of 815 was altered by at least 2.0 fold in the
rasV12/E1A-transformed fibroblasts, of which 203 corresponded
to ESTs. Among known genes, 202 were up-regulated
(Table 1)(see Additional file 1) whereas 410 were
down-regulated (Table 2)(see Additional file 2) by rasV12/E1A-transformation.
It is interesting to note that about
one half of genes encoding transcription factors, signaling
proteins, membrane proteins, channels, or apoptosis-related
proteins was up-regulated whereas the other half
was down-regulated (Figure 2). However, after rasV12/E1A-transformation
most of the genes encoding structural
proteins, secretory proteins, receptors, proteases, protease
inhibitors, extracellular matrix components, proteins involved
in angiogenesis and cytosolic proteins were down-regulated
whereas genes encoding DNA-associated proteins
(involved in DNA replication and repair) and
cell growth-related proteins were up-regulated (Figure 2).
These data may explain, at least in part, the behavior of
transformed cells. For example, down-regulation of structural
proteins, extracellular matrix components, secretory
proteins and receptors is consistent with reversion of the
phenotype of transformed cells towards a less differentiated
phenotype and up-regulation of cell growth-related
proteins and DNA-associated proteins is consistent with
their accelerated growth.
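
As a quick arithmetic check of the counts reported above, the following Python sketch adds the three reported groups and expresses the total as a fraction of the array. The figure of 12,000 is the approximate array size given in the text; the variable names are illustrative.

```python
# Counts taken from the text; 12,000 is an approximate array size.
analyzed = 12000
up_known = 202     # known genes up-regulated
down_known = 410   # known genes down-regulated
ests = 203         # altered sequences corresponding to ESTs

altered = up_known + down_known + ests
percent_altered = 100.0 * altered / analyzed

print(altered, round(percent_altered, 1))
```

The total agrees with the 815 altered sequences reported, and the percentage rounds to the 6.8% quoted in the Discussion.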

Transcription factors

57 genes encoding transcription factors were up-regulated
and 45 down-regulated by rasV12/E1A-transformation.
The most strongly activated genes corresponded to the
homeobox protein SPX1 (39 fold), myb proto-oncogene
(25 fold) and the paired-like homeodomain transcription
factor (19 fold), whereas the most repressed were the osteoblast
specific factor 2 (123 fold), the p8 protein (51
fold), the HID mRNA (21 fold) and the early B-cell factor
(20 fold).

Structural proteins

Expression of 10 genes encoding structural proteins was
up-regulated in transformed MEFs, and 44 were down-regulated.
The most important up-regulation was observed
for cytokeratin (26 fold) and desmoplakin I (17
fold), the strongest down-regulations for smooth muscle
calponin (115 fold), transgelin (49 fold), debrin (41

Figure 1

A. Expression of RAS was verified by immunoblot analysis in MEFs transduced with pBabe (control) or pBabe-rasV12/E1A
(transformed) retroviruses. B. Morphological aspect of the pBabe and pBabe-rasV12/E1A transduced mouse embryonic fibroblasts.
C. Anchorage-independent growth of the rasV12/E1A transformed MEFs. Fifty thousand cells were plated on 0.6% agar in
DMEM-10% FCS and overlaid on 0.6% agar in the same medium. Photomicrographs were taken 10 days after plating. D. rasV12/E1A
transformed MEFs induce tumor formation. One million pBabe and pBabe-rasV12/E1A transduced mouse embryonic
fibroblasts were injected in 200 µl PBS as xenografts in nude mice. Representative mice at day 18.

Gene expression changes after rasV12/E1A-transformation.
Genes up-regulated or down-regulated were
grouped by function (transcription factors, structural proteins,
signaling, secretory proteins, receptors, protein synthesis,
proteases, protease inhibitors, membrane proteins,
extracellular matrix, enzymes, DNA-associated proteins,
cytosolic proteins, channels, cell growth-associated proteins,
angiogenesis, apoptosis and unknown function). Bars represent
the number of genes in each group.



teins were repressed after rasV12/E1A-transformation. The
most affected genes were those encoding cholecystokinin
(112 fold), serum amyloid A3 (85 fold), PRDC (58 fold),
insulin-like growth factor binding protein 5 (41 fold),
gremlin (36 fold), follistatin (33 fold), the small inducible
cytokine subfamily B (27 fold), cytokine SDF-1-beta
(23 fold) and the small inducible cytokine A7 (22 fold).

Receptors 

8 receptors were up-regulated and 38 down-regulated in
transformed fibroblasts. Overexpression was observed for
acetylcholine receptor beta (8 fold), tyrosine kinase receptor
(3 fold), growth hormone releasing hormone receptor
(3 fold), semaphorin M-sema G (3 fold) and amphiregulin
(2 fold). Strongest down-regulations were found for
integrin alpha 5 (43 fold), transient receptor protein 2 (19
fold), retinoic acid receptor alpha (14 fold), retinoic orphan
receptor 1 (11 fold) and platelet derived growth factor
receptor (12 fold).

Protein synthesis 

3 genes involved in protein synthesis (BRIX, nucleolin, ribosomal
protein L44 and SIK similar protein) were overexpressed
and 2 (ribosomal protein S4X and ribosomal
protein L39) were down-regulated, suggesting that protein
synthesis is not strongly affected by transformation.

Proteases and protease inhibitors 

Only the kallikrein B protease and the elafin-like protein
II protease inhibitor were up-regulated (3 and 2 fold respectively)
after rasV12/E1A-transformation. By contrast,
16 proteases and 7 protease inhibitors were found repressed
in transformed MEFs. The tolloid-like (41 fold)
and meltrin beta (33 fold) were the proteases most down-regulated
and the tissue factor pathway inhibitor 2 (44 fold)
and the plasminogen activator inhibitor (31 fold) were
the most affected protease inhibitors.



fold), p50b (35 fold) and vascular smooth muscle alpha-actin
(34 fold).

Signaling factors 

36 genes encoding proteins involved in numerous signaling
pathways were up-regulated and 79 down-regulated in
rasV12/E1A-transformed MEFs. The EGP314 precursor (also
known as the calcium signal transducer 1) was found
25 fold up-regulated, whereas the cysteine rich intestinal
protein (41 fold) and ASM-like phosphodiesterase 3a (31
fold) were the most strongly down-regulated genes.

Secretory proteins 

Only one gene, encoding the transforming growth factor
alpha, was detected as up-regulated (3 fold) in transformed
cells. By contrast, expressions of 54 secretory pro-



Membrane proteins 

15 genes encoding membrane proteins were up-regulated
and 21 were down-regulated. Histocompatibility 2, D region
locus (16 fold) and melanoma differentiation associated
protein (9 fold) were the most overexpressed genes,
whereas Thy-1.2 glycoprotein (36 fold), cadherin 11 (14
fold) and vascular cell adhesion molecule 1 (13 fold) were
the most repressed.

Extracellular matrix 

Laminin gamma 1 (20 fold) and entactin-2 (6 fold)
were the two extracellular matrix-encoding genes found
up-regulated during transformation, whereas 24 genes
were down-regulated. Among them, procollagen type VI,
alpha 1 (121 fold), procollagen type III, alpha 1 (56
fold), procollagen type I, alpha 1 (44 fold), procollagen
type I, alpha 2 (37 fold), collagen type VI, alpha 3
subunit (21 fold) and decorin (19 fold) were the most
affected.

Enzymes 

Twelve enzymes involved in cellular metabolism were
found overexpressed after rasV12/E1A-transformation and
44 were found down-regulated. The most activated genes
were serine hydroxymethyl transferase 1 (6 fold), acetyl
coenzyme A dehydrogenase (5 fold) and the acetyl transferase
(GNAT) family containing protein (4 fold), whereas
the most repressed genes were lysozyme P (88 fold),
lysyl oxidase (61 fold) and lysozyme M (55 fold). Interestingly,
maximal overexpressions were 6, 5 and 4 fold,
whereas down-regulations were 88, 61 and 55 fold, indicating
that in addition to the fact that more genes were
down-regulated (44 vs. 12), the change in expression was also
larger for down-regulated genes.

DNA-associated proteins

25 genes encoding DNA-associated proteins were up-regulated,
whereas no gene of this family was found down-regulated.
The most strongly activated genes were nucleoside
diphosphate kinase (9 fold), the topoisomerase-inhibitor
suppressed (7 fold), the helicase lymphoid specific
(6 fold) and the DNA2-like homolog (6 fold).

Cytosolic proteins

Expression of 2 genes encoding cytosolic proteins was activated
after rasV12/E1A-transformation, whereas expression
of 6 genes was repressed. Genes coding for acyl-CoA-binding
protein (3 fold) and tubulin-specific chaperone
(2 fold) were overexpressed, whereas the most strongly repressed
gene was that coding for cytochrome P450 (61 fold).

Channels

5 genes encoding channels were up-regulated and also 5
were down-regulated. Chloride channel protein 3 was the
most up-regulated gene (11 fold) and the channel beta-1
subunit (15 fold) was the most down-regulated gene.

Cell growth-associated proteins

As expected for transformed cells which grow more rapidly,
13 genes encoding proteins involved in cell growth
were found overexpressed, whereas only 3 were found
down-regulated in rasV12/E1A-transformed MEFs. The
most activated genes were those coding for cyclin-dependent
kinase-like 2 (6 fold) and cell division cycle 7-like 1 (5
fold) whereas the most repressed gene was cyclin D2 (4
fold).

Angiogenesis 

Angiogenesis is a key process in carcinogenesis. Contrary
to what is expected for a tumor cell, we were unable to
find angiogenesis-associated genes up-regulated by
rasV12/E1A-transformation. To our surprise, all 8 genes associated
with angiogenesis showing differential expression
were down-regulated. These included genes coding
for thrombospondins 1 (15 fold), 2 (32 fold) and 3 (6
fold), pigment epithelium-derived factor (26 fold), pleiotrophin
(24 fold), GRO1 oncogene (16 fold), angiogenin-related
protein (4 fold) and tumor necrosis factor induced
protein 2 (3 fold).

Apoptosis 

8 apoptosis-related genes were up-regulated in transformed
MEFs and 3 down-regulated. The p53 apoptosis
effector related to Pmp22 was the most activated gene (19 
fold) and death-associated protein 1 gene was the most 
under-expressed (4 fold) after transformation. 

Unknown function 

3 genes encoding proteins without well defined function
were found up-regulated in mutated ras-E1A expressing fibroblasts,
whereas 8 were found to be down-regulated.

As a proof-of-principle, we verified the relative expression
levels of 11 of these 815 genes by Northern blot analysis.
The following 11 genes were tested: p8, transgelin, serum
amyloid A3, lysyl oxidase, thrombospondin 2, extracellular
superoxide dismutase, biglycan, myb, cytokeratin,
HMG2 and ezrin. In all of them Northern blot data confirmed
microarray data. The first 7 were down-regulated
in transformed MEFs, the 4 others being overexpressed
(Figure 3).

Discussion 

A number of ras-regulated genes have been identified by
studies on immortalized cells or cancer cells expressing
the oncogenic ras [14-21]. However, although these results
are quite interesting, it is important to note that established
cell lines are frequently subject to genetic and
epigenetic changes that are selected during passaging or
immortalization and may affect ras target-gene expression.
Primary cultures, such as mouse embryonic fibroblasts,
do not have that drawback. This is why, to identify
ras target genes, we decided to analyze global gene expression
shortly after retroviral transfer of an ectopic mutated
ras in MEFs. Yet, because activated ras alone induces MEF
senescence instead of transformation, we associated with it
the adenovirus-derived oncogene E1A. The rasV12/E1A
transformation of MEFs (and of other non-immortalized
cells as well) is specific and controlled. Using the Affymetrix
technology on ~12,000 genes, we found that
expression of 6.8% of them was significantly modified in
MEFs by rasV12/E1A-transformation. Because oncogenic
transformation of fibroblasts allows tumor development
when cells are injected in the immunocompromised
mouse (see Figure 1), studying target genes of activated ras
should improve our understanding of the molecular
mechanisms by which ras transforms cells and eventually

Figure 3

Confirmation of microarray results by Northern blot analysis. 18S rRNA was used as a loading control. Total RNA isolated
from pBabe and pBabe-rasV12/E1A transduced MEFs was blotted onto Hybond-N membranes and hybridized with 32P-labeled
probes as described in Material and Methods section.

allows tumor formation. It is interesting to note that only
24% of down-regulated and 40% of up-regulated genes
showed strong modification (i.e., >5 fold change) of their
expression after transformation.

Several examples of genes up- or down-regulated upon ras
transformation have already been reported [22-25].
Present data on systematic analysis of about one third of
the expressed genome confirm those reports while extending
considerably our knowledge of genes activated or repressed
by oncogenic ras in association with the E1A
adenoviral protein. Our results may explain the behavior
of transformed cells. For example and as expected, virtually
all of the genes coding for secreted factors or extracellular
matrix components, which are associated with a
differentiated phenotype, were down-regulated. Also,
morphological changes observed after transformation
(see Figure 1) may be explained by the fact that 44 genes
encoding structural proteins were under-expressed. Another
expected result was that cell growth-related proteins
(involved in the regulation of the cell cycle or inducing
cell proliferation) and DNA-associated proteins (involved
in DNA replication and repair) were up-regulated in
transformed MEFs, in agreement with their accelerated
growth. Also, it is not a surprise to find an altered expression
for 56 enzymes involved in cell metabolism because,
compared to normal fibroblasts, transformed cells show
accelerated growth, increased migration capacity and
strong morphological changes. These enzymes could be
involved in some of these changes.

Several genes coding for transcription factors (n = 102) 
and proteins involved in signaling pathways (n = 115) 
were up- or down-regulated suggesting that modification 
of the amounts of these factors could be responsible for
the dramatic changes in gene expression observed in 
transformed cells. It is interesting to note that approxi- 
mately as many transcription factors were up-regulated (n 
= 57) as down-regulated (n = 45). 

Besides data coherent with previous knowledge, we also
found very unexpected results. For example, we found
that genes coding for proteases and inhibitors of proteases
were strongly down-regulated by rasV12/E1A transformation.
This was surprising since these factors are up-regulated
and strongly involved in tumor progression involving
mutated ras. This observation could suggest that in
human cancers, proteases and protease inhibitors are activated
through a mechanism disconnected from ras activation.
We were similarly surprised by the fact that all 8
angiogenic factors present on the array were found down-regulated
by rasV12/E1A transformation. Like proteases
and inhibitors of proteases, angiogenic factors are involved
in tumour progression and still repressed during
rasV12/E1A-mediated transformation. It is therefore highly
unlikely that their overexpression reported in several
cancers is controlled by a ras-dependent pathway. Finally,
it was also unexpected that only 5 genes involved in protein
synthesis were up- or under-expressed, suggesting
that protein synthesis is not strongly altered after rasV12/E1A
transformation.

Conclusions 

In conclusion, this study of a large number of genes has
identified those whose expression is altered upon fibroblast
transformation by rasV12/E1A. It is however not exhaustive
because the analyzed genes are only
representative of the genome (one third of the expressed
genes was examined), and post-transcriptional and post-translational
modifications were not addressed. Yet, information
gathered should be quite useful to future investigations
on the molecular mechanisms of oncogenic
transformation.

Methods 

Primary mouse embryo fibroblasts (MEFs) 

Primary embryo fibroblasts were isolated from 14.5 day-old
SV129J mouse embryos following standard protocols
[26]. Cells were grown in Dulbecco's modified Eagle's medium
(DMEM) supplemented with 10% foetal calf serum,
2 mM L-glutamine, 100 IU/ml penicillin G and 100 µg/ml
streptomycin.

Retroviral infection 

Oncogenic ras transforms most immortal rodent cells to a
tumorigenic state, whereas transformation of mouse primary
cells requires either a cooperating oncogene or the
inactivation of a tumour suppressor gene. The adenovirus
E1A oncogene cooperates with ras to transform primary
mouse fibroblasts [7] and abrogates ras-induced senescence
[27]. Therefore, we transduced MEFs with the
pBabe-rasV12/E1A retroviral vector which expresses both
the rasV12 mutated protein and the E1A oncogene to obtain
transformed fibroblasts. pBabe-rasV12/E1A [described
in ref. [27]] and pBabe (as control) plasmids were obtained
from S. Lowe. Bosc 23 ecotropic packaging (10^)
cells were plated in a 6-well plate, incubated for 24 hr, and
then transfected using PEI with 5 µg of retroviral plasmid.
After 48 hr, the medium containing the virus was filtered
(0.45 µm filter, Millipore) to obtain the first supernatant.
MEFs were plated at 2 x 10^5 cells per 35 mm dish and incubated
overnight. For infections, the culture medium
was replaced by an appropriate mix of the first supernatant
and culture medium (V/V), supplemented with 4 µg/ml
polybrene (Sigma), and cells were incubated at 37 °C.
As a control, we evaluated the ability of the retroviral vector
to transduce MEFs by using a retroviral vector expressing
the EGFP under control of the retroviral promoter
located in the long terminal repeat. About 30% of MEFs
expressed high levels of EGFP fluorescence 48 h after






Molecular Cancer 2003, 2 



http://www.molecular-cancer.com/content/2/1/19



transduction (data not shown), indicating that retroviral
vectors are well adapted to our experimental set-up. Retrovirus-infected
cells were selected with puromycin (0.7 µg/ml).
Transformation of MEFs by the pBabe-rasV12/E1A
retroviral vector was evaluated by examining changes in
their morphological aspect, by quantifying expression of
the RAS protein by western blot, and by monitoring cell proliferation,
colony formation in soft agar and tumor formation in
nude mice. In soft-agar assays, pBabe-rasV12/E1A transformed
cells formed colonies at high frequency (Figure 1).
Similarly, transformed cells produced tumors in all (3/3)
athymic nude mice when injected subcutaneously, whereas
control MEFs did not (0/3) (Figure 1).

Western blot analysis 

One hundred µg of total protein extracted from cells was
separated with standard procedures on 12.5% SDS-PAGE
using the Mini Protean System (Bio-Rad) and transferred
to a nitrocellulose membrane (Sigma). The intracellular
level of RAS was estimated by Western blot using the H-ras
(C-20) polyclonal antibody (1:200) purchased from
Santa Cruz Biotechnology, Inc.

Microarray 

Total RNA was isolated by Trizol (Gibco-BRL by Invitrogen).
Twenty µg of total RNA was converted to cDNA with
Superscript reverse transcriptase (Gibco-BRL by Invitrogen),
using T7-oligo-d(T)24 as a primer. Second-strand
synthesis was performed using T4 DNA polymerase and E.
coli DNA ligase followed by blunt ending by T4 polynucleotide
kinase. cDNA was isolated by phenol-chloroform
extraction using phase lock gels (Brinkmann). cDNA was
in vitro transcribed using the T7 BioArray High Yield RNA
Transcript Labeling Kit (Enzo Biochem, New York, N.Y.)
to produce biotinylated cRNA. Labelled cRNA was isolated
using an RNeasy Mini Kit column (Qiagen). Purified
cRNA was fragmented to 200-300 mer cRNA using a fragmentation
buffer (100 mM potassium acetate-30 mM
magnesium acetate-40 mM Tris-acetate, pH 8.1), for 35
min at 94 °C. The quality of total RNA, cDNA synthesis,
cRNA amplification and cRNA fragmentation was monitored
by micro-capillary electrophoresis (Bioanalyzer
2100, Agilent Technologies). The
cRNA probes were hybridized to an MGu74Av2 Genechip
(Affymetrix, Santa Clara, CA). The MGu74Av2 Genechip
represents ~6,000 sequences of mouse Unigene that have
been functionally characterized and ~6,000 EST cluster
sequences. Each sequence in the chip is represented by
32 probes: 16 "perfect match" (PM) probes that are complementary
to the mRNA sequence and 16 "mismatch"
(MM) probes that only differ by a single nucleotide at the
central base (more detailed information about the
MGu74Av2 Genechip can be obtained at the web site
http://www.affymetrix.com). Fifteen micrograms of fragmented
cRNA was hybridized for 16 h at 45 °C with



constant rotation (60 rpm). Microarrays were processed in
an Affymetrix GeneChip Fluidic Station 400. Staining was
made with streptavidin-conjugated phycoerythrin (SAPE)
followed by amplification with a biotinylated anti-streptavidin
antibody and a second round of SAPE, and
then scanned using an Agilent GeneArray Scanner (Agilent
Technologies). Expression value (signal) is calculated
using Affymetrix Genechip software MAS 5.0 (for a full
description of the statistical algorithms see
http://affymetrix.com/support/technical/whitepapers/sadd_whitepaper.pdf).
Briefly, signal is calculated as follows:
First, probe cell intensities are processed for global
background. Then, MM value is calculated and subtracted
to adjust the PM intensity in order to incorporate some
measure of non-specific cross-hybridization to mismatch
probes. Then, this value is log-transformed to stabilize the
variance. Signal is output as the antilog of the resulting
value. The 20 probe pairs representing each gene are consolidated
into a single expression level. Finally, software
scales the average intensity of all genes on each array within
a data set. Final value of signal is considered representative
of the amount of transcript in solution.
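
The steps just listed (PM minus MM adjustment, log transform, consolidation of probe pairs, antilog, array scaling) can be sketched in Python. This is a deliberately simplified illustration and not the MAS 5.0 algorithm itself: it uses a plain mean of log values where the Affymetrix software uses robust statistics, it skips background processing, and the `floor` and `target` parameters and all intensity values are assumptions introduced only for the example.

```python
import math


def probe_set_signal(pm, mm, floor=1.0):
    """Simplified signal for one probe set.

    pm, mm: perfect-match and mismatch intensities, one value per probe pair.
    floor:  hypothetical lower bound that keeps PM - MM positive before the log.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    logs = []
    for p, m in zip(pm, mm):
        adjusted = max(p - m, floor)       # subtract the mismatch intensity
        logs.append(math.log2(adjusted))   # log-transform to stabilize variance
    mean_log = sum(logs) / len(logs)       # plain mean (simplification)
    return 2.0 ** mean_log                 # antilog of the consolidated value


def scale_array(signals, target=100.0):
    """Scale all signals so that the array mean equals `target` (assumed value)."""
    mean = sum(signals.values()) / len(signals)
    factor = target / mean
    return {gene: s * factor for gene, s in signals.items()}


# Hypothetical intensities for a three-pair probe set.
example_signal = probe_set_signal([260.0, 520.0, 1040.0], [4.0, 8.0, 16.0])
scaled = scale_array({"gene_a": 50.0, "gene_b": 150.0}, target=200.0)
```

Scaling each array to a common mean is what makes signals comparable between the control and transformed hybridizations.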

Housekeeping controls β-actin and GAPDH genes serve as
endogenous controls and are useful for monitoring the
quality of the target. Their respective probe sets are designed
to be specific to the 5', middle, or 3' portion of the
transcript. The 3'/5' signal ratio from these probe sets is informative
about the reverse transcription and in vitro transcription
steps in the sample preparation. Then, an ideal
target in which all transcripts were full-length transcribed
would have an identical amount of signal 3' and 5' and
the ratio would be equal to 1. Differences greater than
three fold between signal at 3' and 5' for these housekeeping
genes indicate that RNA was incompletely transcribed
or target may be degraded. Ratio of fluorescent intensities
for the 5' and 3' ends of these housekeeping genes was <2.

Hybridization experiments were repeated twice using
independent cRNA probes synthesized with RNA from two
independent sets of MEF-infected cells. Genes were
considered differentially expressed when both
hybridizations showed a >2-fold change. Data presented in
this work represent the average of both hybridizations. The
list of unchanged genes can be obtained from the authors
upon request.
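
The selection rule above (a >2-fold change in both hybridizations, reported as the average) can be illustrated with a minimal sketch. The gene names and fold values below are hypothetical, and two points are assumptions not stated in the text: that a ">2-fold change" covers both up- and down-regulation in a consistent direction, and that the "average" is an arithmetic mean.

```python
def differentially_expressed(rep1, rep2, threshold=2.0):
    """Return {gene: mean fold change} for genes passing in both replicates.

    rep1, rep2: dicts mapping gene name -> fold change (treated / control).
    Assumption: a gene passes when both replicates change in the same
    direction by more than `threshold` (ratio > threshold, or < 1/threshold).
    """
    selected = {}
    for gene in rep1.keys() & rep2.keys():
        f1, f2 = rep1[gene], rep2[gene]
        up = f1 > threshold and f2 > threshold
        down = f1 < 1.0 / threshold and f2 < 1.0 / threshold
        if up or down:
            # Arithmetic mean of the two hybridizations (an assumption).
            selected[gene] = (f1 + f2) / 2.0
    return selected


# Hypothetical fold changes from two independent hybridizations.
rep1 = {"geneA": 3.0, "geneB": 2.5, "geneC": 0.4, "geneD": 1.1}
rep2 = {"geneA": 2.4, "geneB": 1.6, "geneC": 0.3, "geneD": 0.9}
result = differentially_expressed(rep1, rep2)
print(sorted(result))
```

Here geneB is dropped because only one of its two hybridizations exceeds the threshold.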

Validation of gene expression profiles by Northern blot 
hybridization 

Synthesis of probes: One microgram of total RNA from
MEF cells was subjected to PCR with reverse transcription
using the One Step RT-PCR kit (Gibco-BRL) according to
the manufacturer's protocol to synthesize specific cDNA
probes. PCR was carried out for 32 cycles, each cycle
consisting of a denaturing step for 1 min at 94 °C, an
annealing step for 2 min at 56 °C and a polymerization
step for 2 min at 72 °C. Selected RNA species were
amplified using the following primers:
p8, sense, 5'-ggagagagcagactaggcata-3' and
antisense, 5'-gttgctgccacccaagggcat-3';
transgelin, sense, 5'-ccagccagctagcagatggg-3' and
antisense, 5'-gcaggcagatttagagttc-3';
serum amyloid A3, sense, 5'-ggatgaagccttccattgcc-3' and
antisense, 5'-gaagagctacaccgccactc-3';
lysyl oxidase, sense, 5'-taaaacgactgtccccaacc-3' and
antisense, 5'-tcacggccgttgttagtgta-3';
thrombospondin 2, sense, 5'-aagcccagtcgggatacgg-3' and
antisense, 5'-tgrtggagctggagccrtgc-3';
extracellular superoxide dismutase, sense,
5'-ccttagttaacccagaaatct-3' and antisense,
5'-gtacctcaaaggtgctcactgg-3';
biglycan, sense, 5'-ggctgctttctgcttcacagg-3' and
antisense, 5'-gcaactgaccatcacctccta-3';
myb proto-oncogene, sense, 5'-ctaaaccatttcatgaggag-3' and
antisense, 5'-aacaaatgcaaaattcaccc-3';
cytokeratin, sense, 5'-ctggtctcagcagattgagg-3' and
antisense, 5'-ggtaggtggcaatactgcc-3';
high mobility group protein 2, sense,
5'-cgtctgccttctgcctgttttg-3' and antisense,
5'-gcccttgacacggtatgcagc-3';
and ezrin, sense, 5'-caacgaggagaagcggatca-3' and
antisense, 5'-gtglgacacctgcctgcagtg-3'.
Specificity of the PCR products was confirmed by direct
DNA sequencing.

Northern blot hybridization: RNA samples (10 µg) were
submitted to electrophoresis on a 1% agarose gel and
vacuum blotted onto Hybond-N membranes (Amersham).
The filters were hybridized with the 32P-labeled probes for
16 h at 65 °C in 5X SSPE (1X SSPE is 180 mM NaCl, 1 mM
EDTA, 10 mM NaH2PO4, pH 7.5), 5X Denhardt solution,
0.5% SDS and 100 µg/ml single-stranded herring sperm
DNA. Filters were then washed four times for 5 min at
room temperature in 2X SSC, 0.2% SDS, twice for 15 min
at 50 °C in 0.2X SSC, 0.2% SDS, and once for 30 min in
0.1X SSC at 50 °C before autoradiography exposure on
Kodak X-Omat films at -80 °C for 8 hr to 4 days.

As we enter an age in which genomics and bioinformatics make possible the discovery of
new knowledge about the biological characteristics of an organism, it is critical that we
attempt to report newly discovered "significant" phenotypes only when they are actually of
significance. With the relative youth of genome-scale gene expression technologies, how to
make such distinctions has yet to be better defined. We present a "mask technology" by
which to filter out those levels of gene expression that fall within the noise of the
experimental techniques being employed. Conversely, our technique can lend validation to
significant fold differences in expression level even when the fold value may appear quite
small (e.g., 1.3). Given array-organized expression level results from a pair of identical
experiments, our ID Mask Tool enables the automated creation of a two-dimensional "region
of insignificance" that can then be used with subsequent data analyses. Fundamentally, this
should enable researchers to report on findings that are more likely to be truly
meaningful in nature. Moreover, this can prevent major investments of time, energy, and
biological resources into the pursuit of candidate genes that represent false positives.



1 Introduction



As we enter one of the most exciting times in the history of science, in which
genomics and bioinformatics are coming together to make possible the discovery of
new knowledge about living organisms at their molecular level, it is imperative that
we avoid discovery of "truths" that are not so. While the temptation to plunge into
tracing out metabolic pathways, cellular interactions, or genetic regulatory circuits
is very strong, especially now that we have technologies allowing genome-wide study
of RNA expression, we must pause long enough to consider how best to
report our results such that they may be meaningful. Specifically, for
microarray-based expression technologies, whether they are glass microarrays, nylon
membranes, or other formats, we need to better understand how to distinguish
significant fold difference values from those that fall within the noise level of the
experiment at hand.

Francis Collins rightfully speculates about the large impact that microarray
technology is likely to have, yet reminds us of the "many critically important
questions about this new field that are yet unaddressed" [1]. Some have criticized
array-based methods for not being model-based, or hypothesis-driven, while others
argue that their exploratory nature can lead to new hypotheses that then can be
tested in the laboratory [2]. Especially because such hypothesis testing of candidate
genes, cell-cell interactions, or pathways requires a major investment of time,
energy, and biological resources, an important challenge is understanding how to
better recognize false-positive results.

We present a "mask technology" by which to filter out those levels of gene 
expression that fall within the noise of the experimental techniques being employed. 
Conversely, our technique can lend validation to the significance of fold differences
in expression level even when the fold value may appear quite small. Our work is 
based on the notion that gene expression measurements ought to be repeatable. Fold 
differences for each corresponding pair of genes in a pair of "identical" experiments 
should therefore be equal to unity. Identical experiments are ones in which the 
operating conditions, cell lines, culture media, incubation time, and so forth are 
controlled to be the same. We first explore whether this is the case by examining 
several pairs of identical experiments. We then develop the ID Mask Tool, which 
enables the automated creation of a two-dimensional "region of insignificance" that 
can be used with subsequent data analyses. 

2 Materials and Methods 

2.1 Data Collection 

The data for this study were collected to evaluate the use of microarray technology
for detection of ESE-1 target genes after transient transfection into different cell
lines. We hypothesized that a transfection efficiency of greater than 70-80% should
be sufficient to detect differences in gene expression between two samples. We first
determined the transfection efficiency of various cell lines using a green fluorescent
protein (GFP) expression vector. Four of the cell lines tested (HT1080, 293, MCF-7,
and MG-63) met the criteria we had set. Total RNA was isolated from MCF-7
human breast cancer cells and MG-63 human osteosarcoma cells transiently
transfected with an ESE-1 expression vector 20 and 24 hours after transfection.
Experiments were performed in duplicate in order to distinguish genuine differences
in gene expression from differences due to "biological noise." Specifically, six pairs of these
duplicated experiments served as the source of the data that we subsequently used to
develop the identity mask methodology. The ESE-1 expression vector also
expressed GFP, which enabled us to confirm transfection efficiencies for each
experiment. 32P-labeled cDNA probes reverse-transcribed from these RNAs were
hybridized to the Atlas Human cDNA Expression Arrays from Clontech (Clontech
Laboratories, Inc., Palo Alto, CA) [3]. Each of these Atlas Arrays (Human 1.2 I,
Human Cancer) is a nylon membrane on which approximately 1200 human cDNAs
have been immobilized. The hybridization results were analyzed with the software
provided by Clontech by normalizing to the signals obtained from housekeeping
gene controls on the same array as well as by global normalization. The microarray
experiments were validated by RT-PCR using the same RNAs.

2.2 Data Analysis and Mask Creation 

We developed the ID Mask Tool, a custom-designed computer program written in
the C language, to perform mask creation. The ID Mask Tool takes as input two
spreadsheet files corresponding to two identical experiments, along with two
user-customizable parameters to be discussed below. It returns as output an "identity
mask," or ID Mask, specifically for those two experiments.

Each spreadsheet contains the names of several hundred genes and their
corresponding brightness intensity levels (as assessed by hybridization of the probe
of interest). Only genes present in both files are further considered. For each of
these genes, we calculate a "fold difference," the ratio of the intensity in file 2 to the
intensity in file 1 for a given gene. All fold values are then sorted based on the
corresponding intensity values of the set of genes in the first spreadsheet file. Two
parameters are used for creation of each identity mask: intensity range (or sliding
window) size, plus either scale value or number of standard deviations. These are
used to calculate the ID Mask borders and can be adjusted experimentally for better
results.
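
The fold-difference step just described can be sketched as follows. This is an illustrative Python sketch, not the ID Mask Tool itself (which the text says was written in C); the function name, the dictionary representation of a spreadsheet, and the gene names are hypothetical, and skipping genes with zero intensity in file 1 is an assumption.

```python
def fold_differences(file1, file2):
    """Compute (intensity in file 1, fold difference) pairs for shared genes.

    file1, file2: dicts mapping gene name -> intensity.
    Fold difference = intensity in file 2 / intensity in file 1.
    The result is sorted by the file 1 intensity.
    """
    points = []
    for gene in file1.keys() & file2.keys():  # only genes present in both files
        if file1[gene] == 0:
            continue  # fold undefined; skipping such genes is an assumption
        points.append((file1[gene], file2[gene] / file1[gene]))
    points.sort(key=lambda p: p[0])
    return points


# Hypothetical intensities from a pair of "identical" experiments.
file1 = {"g1": 2000, "g2": 500, "g3": 8000, "g4": 100}
file2 = {"g1": 2200, "g2": 400, "g3": 8000, "g5": 50}
points = fold_differences(file1, file2)
print(points)
```

Genes g4 and g5 are ignored because each appears in only one file.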

Two methods are then explored for creating identity masks. Method 1 relies on
segmental calculation of standard deviations. A "data point" refers to an (x, y)
pairing in which x is an intensity value from the first spreadsheet file and y is its
corresponding fold difference value (calculated as above). Using all data points in a
given sliding window of intensity values (e.g., from intensity level 1001 to 2000),
the standard deviation of the fold values is calculated. The average of the intensity
values within that window is then paired with a fold value equal to the average fold
value within that window plus the number of standard deviations specified by the
user. This new pair becomes a candidate "upper mask border" point. Similarly, a
candidate "lower mask border" point is created by pairing the average intensity
value of that window with the average fold value minus the number of standard
deviations specified by the user. Each successive group of data points in each
sliding window of intensity values (e.g., all points from 2001 to 3000, then all points
from 3001 to 4000, etc.) likewise gives rise to candidate mask border points.
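
A minimal Python sketch of the Method 1 candidate-point calculation follows. It is illustrative only: the function name and data are hypothetical, the exact window boundaries are simplified to [k*window, (k+1)*window), the sample standard deviation is used, and windows with fewer than two points are skipped. None of these details is specified in the text.

```python
from statistics import mean, stdev


def method1_candidates(points, window, n_sd):
    """Return (upper, lower) candidate mask border points.

    points: list of (intensity, fold) pairs.
    window: intensity window size; n_sd: number of standard deviations.
    """
    bins = {}
    for x, y in points:
        bins.setdefault(int(x // window), []).append((x, y))

    upper, lower = [], []
    for k in sorted(bins):
        xs, ys = zip(*bins[k])
        if len(ys) < 2:
            continue  # standard deviation undefined; skipping is an assumption
        sd = stdev(ys)  # sample standard deviation (an assumption)
        upper.append((mean(xs), mean(ys) + n_sd * sd))
        lower.append((mean(xs), mean(ys) - n_sd * sd))
    return upper, lower


# Hypothetical (intensity, fold) data points.
data = [(1200, 0.9), (1500, 1.1), (1800, 1.0), (2500, 0.95), (2900, 1.05)]
upper, lower = method1_candidates(data, window=1000, n_sd=3)
print(upper)
print(lower)
```

For the first window the fold values have mean 1.0 and standard deviation 0.1, so the candidate points sit at fold values of about 1.3 and 0.7 at the average intensity of 1500.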

The set of (intensity value, fold value) pairs comprising the candidate upper mask
border points is then fit to a line using least-squares linear regression. This line
defines the upper mask border. Similarly, linear regression is used to find the lower
mask border from the set of calculated candidate lower mask border points. If one
of the derived mask borders fits poorly (based upon relationship to original data
points), the "reciprocal reflection" of the other mask border can serve in its place.
This simply means that each (x, y) point on the good-fit (linear) border gives rise to
a point (x, 1/y) to create the reciprocal reflection border. (See Figures 1 through 6
for examples of mask borders. Figures 2 through 5 show ID Masks each consisting of one
linear regression border and one border derived by taking the reciprocal values of
that linear regression border.) The region between these borders represents the
"identity" region of insignificant fold differences (i.e., noise).




Figure 1: Identity mask for Experiment A. Method 2 with parameters 9000 for 
intensity sliding window size and 0.975 for scale resulted in the lowest percentage of 
original data points lying outside of the mask region (0.7%). 
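
The border-fitting step described before Figure 1 (least-squares regression on the candidate points, with the reciprocal reflection as a fallback) can be sketched as below. The function names and the candidate points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept.

    points: list of (x, y) pairs with at least two distinct x values.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def reciprocal_reflection(slope, intercept):
    """Map each (x, y) on the fitted border to (x, 1/y)."""
    return lambda x: 1.0 / (slope * x + intercept)


# Hypothetical candidate upper mask border points.
upper_candidates = [(1000, 1.6), (5000, 1.4), (9000, 1.2)]
slope, intercept = fit_line(upper_candidates)
lower_border = reciprocal_reflection(slope, intercept)
print(slope, intercept, lower_border(5000))
```

Note that the reflected border is a curve rather than a straight line, since the reciprocal of a linear function is not linear.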



Method 2 for creating an identity mask is similar to Method 1 except that candidate
mask border points are derived from maximal (and minimal) points in each intensity
window rather than from standard deviation calculations. Specifically, amongst all
data points in a given window of intensity values, the point with the greatest fold
value is chosen. This is repeated for each successive window of intensity values.
These fold values can also be scaled before use in linear regression to find the upper
mask border. The lower mask border is analogously derived from the smallest fold
values.

Once the ID Mask has been derived, all original data points are checked for
inclusion or exclusion in the identity mask region. The percentage of data points
lying outside of the mask region is reported.
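
Method 2 candidate selection and the final inclusion check can be sketched together as follows. This is a hedged illustration with hypothetical names and data. The text does not say exactly how the scale factor is applied; multiplying the maximal fold values by the scale and dividing the minimal fold values by it is an assumption made here so that a scale below 1 tightens both borders.

```python
def method2_candidates(points, window, scale=1.0):
    """Pick the extreme fold values in each intensity window.

    points: list of (intensity, fold) pairs.
    Returns (upper, lower) candidate border points.
    """
    bins = {}
    for x, y in points:
        bins.setdefault(int(x // window), []).append((x, y))
    ordered = [bins[k] for k in sorted(bins)]
    upper = [max(b, key=lambda p: p[1]) for b in ordered]
    lower = [min(b, key=lambda p: p[1]) for b in ordered]
    # Assumption: the scale multiplies upper values and divides lower values.
    return ([(x, y * scale) for x, y in upper],
            [(x, y / scale) for x, y in lower])


def percent_outside(points, upper_border, lower_border):
    """Percentage of points above the upper or below the lower border.

    upper_border, lower_border: functions mapping intensity -> fold value.
    """
    outside = sum(1 for x, y in points
                  if y > upper_border(x) or y < lower_border(x))
    return 100.0 * outside / len(points)


# Hypothetical (intensity, fold) data points.
data = [(500, 0.8), (800, 1.3), (1500, 1.1), (1900, 0.9), (2500, 1.05)]
upper, lower = method2_candidates(data, window=1000)
# Constant borders are used here only to keep the check easy to follow.
pct = percent_outside(data, lambda x: 1.2, lambda x: 0.85)
print(upper, lower, pct)
```

In practice the candidate points would first be fit to a line, and the fitted borders would be passed to the inclusion check.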




Figure 2: Identity mask for Experiment B. Method 1 with parameters 9000 for 
intensity window size and standard deviation of 3 resulted in the lowest percentage 
of original data points lying outside of the mask region (1.7%). 

Table 1: Numbers of genes present in each of the experiment pairs, along with the
number of genes common to both files in each pair.

          # Genes        # Genes        # Genes
          in 1st File    in 2nd File    in Both

Expt A    563            559            550
Expt B    292            516            291
Expt C    244            401            244
Expt D    339            518            326
Expt E    365            397            344
Expt F    233            226            180



3 Results 

Six pairs of experiments were performed with Clontech nylon membrane filters and 
tumor cell lines as described in the Methods section, resulting in twelve spreadsheet 
files of genes and their corresponding expression intensity values. The ID Mask 
Tool was used to perform all mask creation experiments as well as basic data 
analysis. Table 1 displays the number of genes present in each of the file pairs, 
along with the number of genes common to both files in each pair.

Figure 3: Identity mask for Experiment C. Method 1 with parameters 9000 for
intensity window size and standard deviation of 3 resulted in the lowest percentage
of original data points lying outside of the mask region (2.0%).

Figure 4: Identity mask for Experiment D. Method 1 with parameters 9000 for
intensity window size and standard deviation of 3 resulted in the lowest percentage
of original data points lying outside of the mask region (1.5%).



For both Methods 1 and 2 of ID Mask creation, sliding windows of size 1000, 5000,
and 9000 on the intensity value axis were chosen for experimentation. Only when
calculations were not possible with one of these window sizes (e.g., due to division
by zero) was an alternative window size chosen. For Method 1, the number of
standard deviations (for calculation of candidate mask border points) was chosen to
be 2.5 and 3. For Method 2, the scale factor was chosen to be 0.975 and 1.0.

Figure 5: Identity mask for Experiment E. Method 1 with parameters 5000 for
intensity window size and standard deviation of 3 resulted in the lowest percentage
of original data points lying outside of the mask region (0.9%).




Figure 6: Identity mask for Experiment F. Method 1 with parameters 9000 for
intensity window size and standard deviation of 3 resulted in the lowest percentage
of original data points lying outside of the mask region (1.7%).



Twelve candidate identity masks were created for each pair of experiments (2 
Methods, times 3 intensity window sizes, times 2 scale or standard deviation 
factors). For each pair of experiments, the ID Mask Tool selected the mask with the 
lowest percentage of original data points lying outside of the mask region. Figures 1 
through 6 show each selected identity mask along with a scatter plot of the original 
(intensity value, fold value) data points for each pair of experiments. Tables 2 and 3 
list the percentages of original data points lying outside of the mask region for each 
of the 12 candidate masks derived for each experiment pair. 
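
The enumeration of the twelve candidate masks and the selection rule can be sketched as below. The function names are hypothetical. In the example scores, only the 0.7% value comes from the Figure 1 caption (Experiment A); every other percentage is a placeholder, not a reported result.

```python
from itertools import product

WINDOW_SIZES = (1000, 5000, 9000)
# Method 1 uses a number of standard deviations; Method 2 uses a scale factor.
METHOD_FACTORS = {1: (2.5, 3.0), 2: (0.975, 1.0)}


def candidate_settings():
    """List all (method, window size, factor) combinations: 2 x 3 x 2 = 12."""
    return [(method, window, factor)
            for method, factors in METHOD_FACTORS.items()
            for window, factor in product(WINDOW_SIZES, factors)]


def select_mask(percent_outside_by_setting):
    """Pick the setting with the lowest percentage of points outside the mask."""
    return min(percent_outside_by_setting, key=percent_outside_by_setting.get)


settings = candidate_settings()
# Placeholder scores (5.0) except for the Figure 1 value for Experiment A.
scores = {s: 5.0 for s in settings}
scores[(2, 9000, 0.975)] = 0.7
best = select_mask(scores)
print(len(settings), best)
```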



Pacific Symposium on Biocomputing 6:496-507 (2001) 



Table 2: Each pair of identical experiments gave rise to 12 candidate ID Masks. Six
of these twelve were derived by Method 1 (three with standard deviation of 3 and
three with standard deviation of 2.5). The other six were derived by Method 2 (three
with scale 1.00 and three with scale 0.975). Shown are the percentages of original
data points lying outside of the mask region for each of the 12 candidate ID Masks
derived for each of Experiments A through C. [σ = standard deviation; intensity range
(window) size of 2000 instead of 1000 is used in Experiments B and C for the

Table 3: Percentages of original data points lying outside of the mask region for
each of the 12 candidate ID Masks derived for Experiments D through F. (See caption in
Table 2 for further details.) [σ = standard deviation; intensity range (window) size
of 3000 instead of 1000 is used in Experiment D for Method 1 trials, while range

4 Discussion

DNA microarrays clearly are making a large impact on the way we approach
problems in molecular biology and genomics. These devices are enabling the
genome-wide study of expression in Escherichia coli K-12, for example [4]. Others
are using DNA microarrays in the study of B-cell lymphomas [5], growth control
genes [6], and aging [7]. Some researchers are focusing on developing new [8] or
using existing [9] clustering techniques to facilitate the analysis of all the data made
available by this relatively new technology. Few, however, have focused
specifically on studying the properties of these array data to better understand how
to distinguish significant from insignificant "findings."

One way we might be able to better discern meaningful discoveries from the rest is 
by applying an identity mask technology, such as the one we have presented. Our 
experiments show that greater amounts of biological noise are present at lower gene 
expression levels. Thus, there is no magical absolute cut-off for a meaningful fold 
value. There does appear to exist, however, a "mask of insignificant values,"
outside of which the fold values are more likely to represent true significance. In 
Figure 6, for example, a fold difference of 1.5 may be meaningful at an intensity 
level of 60,000, while a fold difference of 2.5 may be insignificant at an intensity 
level of 20,000. This result is in stark contrast to a study by Incyte Pharmaceuticals 
[11], in which they conclude: "any elements with observed ratios greater than or 
equal to 1.8 should be deemed differentially expressed." A brief glance at the 
microarray-related literature will quickly confirm that others are also reporting 
particular fold difference values, such as 1.8, as significant [7]. We argue, however, 
that the significance of a fold change depends upon the intensity value; genes that 
are expressed at low levels and hence have weak intensity signals need to show a 
much greater fold difference than highly expressed genes. 

Some have proposed simple statistical tests to determine whether fold differences
are significant; t-tests, for example, are included in the GeneSpring software
package (Silicon Genetics, San Carlos, CA). Lee et al. propose a statistical method
using normal distributions and posterior probabilities to determine the likelihood that
a gene is truly expressed in a tissue sample [12]. Methods like these are no doubt
important; used alone, however, they may under-emphasize the correlation between
fold values and intensity values. Future efforts might explore how to best use
statistical validation techniques in conjunction with the identity mask method.

While our study used Clontech filters, the general techniques presented for
understanding identity masks of insignificance apply to all different types of
expression arrays. Both nylon membrane and glass slide array techniques have their
individual advantages. Nylon membrane arrays have sensitive detection using
hybridized 32P probes. Glass microarrays have high-resolution fluorescent detection,
dual labeling for hybridizing two probes on a single array, and ease in automated
handling of slides [3]. Richmond et al. compared hybridization of radioactive
cDNAs to spot blots on nylon membranes with fluorescence-based hybridization to
glass microarrays; they found both methods to be reliable and reproducible [4].
Chen describes a colorimetry detection system for use with nylon membranes [13].

Regardless of the specific array format employed, it seems clear that a custom- 
derived identity mask is one method that could help improve appropriate reporting 
of fold difference results. Future work should include an exploration of fitting 
curves rather than lines for the mask borders. The upper mask border in Figure 2, 
for example, may benefit from a fitted curve, or at least a piecewise linear model.

An alternative method for mask creation might be to always calculate fold
differences greater than 1 by simply swapping the order of individual intensity
values whenever the fold value is less than 1. Only the upper mask border would
then need to be created. (The lower mask border would be the unity fold difference
line.)
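
This alternative amounts to always dividing the larger intensity by the smaller one. A minimal sketch, with a hypothetical function name and the assumption that non-positive intensities are rejected:

```python
def folded_fold_difference(intensity1, intensity2):
    """Return a fold difference that is always >= 1.

    Equivalent to swapping the two intensities whenever the ordinary
    fold value (intensity2 / intensity1) would be less than 1.
    """
    high = max(intensity1, intensity2)
    low = min(intensity1, intensity2)
    if low <= 0:
        # Rejecting non-positive intensities is an assumption.
        raise ValueError("intensities must be positive")
    return high / low


print(folded_fold_difference(1000, 2000))
print(folded_fold_difference(2000, 1000))
```

Both calls give the same value, so the direction of the change would have to be recorded separately if it matters for the analysis.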

It is not clear why there were some large differences between the numbers of genes 
detected in the experiment pairs of Experiments B, C, and D. These may have been 
due to experimental error or biological noise. Interestingly, the identity masks for 
these three also do not fit as nicely as those for Experiments A, E, and F. 

While we have selected from amongst the candidate identity masks those with the 
lowest percentages of points outside the mask region, future work might consider 
refining the mask fit to purposely exclude approximately 5% of the data points. This 
could be likened to p < 0.05, in which 5% of the time, we may inadvertently report a 
result as significant even though it is not. A potential benefit is a closer overall 
mask fit and therefore less likelihood to call a significant finding insignificant. 

In only one out of the six pairs of experiments did Method 2 (scaling values)
perform better than Method 1 (standard deviations). This is possibly due to the
mathematical basis upon which standard deviations are calculated, making them in
general more robust and accurate. One way in which scaling actual data points can
fail is when there exist outliers. Another is with the choice of too small an intensity
window size. This can lead to a sort of "overfitting" problem; our group of
candidate "maximum" points from which to derive the upper mask border may then
contain several non-maximum values. In Tables 2 and 3, there is a definite trend of
worsening mask fit as one decreases the intensity range (window) size from 9000 to
1000. It is likely that in most applications, Method 1 may be more suitable.

Our aim has been to provide a foundation for evaluating fold values. The ultimate
goal is to find truly significant fold differences when performing "treatment versus
control" comparisons. Analyses of those types of comparisons will likely further our
understanding of the masking technique as well. Especially because we recognize
the use of DNA microarrays as a method by which to explore the genome in a
model-independent fashion [10], it is imperative that we have a basis for judging
exploratory findings as being important or simply "in the noise." Candidate genes
found through exploration can lead to investment of significant resources; we need
to avoid such pursuits of false positive findings.